Metadata transcoding

ABSTRACT

The present document relates to transcoding of metadata, and in particular to a method and system for transcoding metadata with reduced computational complexity. A transcoder configured to transcode an inbound bitstream comprising an inbound content frame and an associated inbound metadata frame into an outbound bitstream comprising an outbound content frame and an associated outbound metadata frame is described. The inbound content frame is indicative of a signal encoded according to a first codec system and the outbound content frame is indicative of the signal encoded according to a second codec system. The transcoder is configured to identify an inbound block of metadata from the inbound metadata frame, the inbound block of metadata associated with an inbound descriptor indicative of one or more properties of metadata comprised within the inbound block of metadata, and to generate the outbound metadata frame from the inbound metadata frame based on the inbound descriptor.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. patent application Ser. No. 14/761,892, filed on Jul. 17, 2015, which is the U.S. national stage of International Patent Application No. PCT/US2014/011695, filed on Jan. 15, 2014, which in turn claims priority to U.S. Provisional Patent Application No. 61/754,893, filed on Jan. 21, 2013, each of which is hereby incorporated by reference in its entirety.

TECHNICAL FIELD

The present document relates to the transcoding of metadata. In particular, the present document relates to a method and system for transcoding metadata with reduced computational complexity.

BACKGROUND

Various single-channel and/or multi-channel audio rendering systems such as 5.1, 7.1 or 9.1 multi-channel audio rendering systems are currently in use. The audio rendering systems allow e.g. for the generation of a surround sound originating from 5+1, 7+1 or 9+1 speaker locations, respectively. For an efficient transmission or for an efficient storing of the corresponding single-channel or multi-channel audio signals, audio codec (encoder/decoder) systems such as Dolby Digital (DD) or Dolby Digital Plus (DD+) are being used.

There may be a significant installed base of audio rendering devices which are configured to decode audio signals which have been encoded using a particular audio codec system (e.g. Dolby Digital). The particular audio codec system may be e.g. referred to as a second audio codec system. On the other hand, the evolution of audio codec systems may lead to an updated audio codec system (e.g. Dolby Digital Plus), which may be e.g. referred to as a first audio codec system. The updated audio codec system may provide additional features (e.g. an increased number of channels) and/or improved coding quality. As such, content providers may be inclined to provide their content in accordance with the updated audio codec system.

Nevertheless, a user having an audio rendering device with a decoder of the second audio codec system should still be able to render the audio content which has been encoded in accordance with the first audio codec system. This may be achieved by a so-called transcoder or converter which is configured to convert the audio content which is encoded in accordance with the first audio codec system into modified audio content which is encoded in accordance with the second audio codec system.

A further need for transcoding may arise along the distribution chain of audio content. Audio content may be encoded by a content provider using an audio codec which is well suited for the production and the broadcasting of audio content (such as the Dolby E audio codec). The audio content may be distributed using this production-oriented audio codec, and the audio content may be transcoded in accordance with a second audio codec (such as the lossless codec Dolby TrueHD or such as the Dolby Digital Plus or the Dolby Digital codec).

The audio content is typically associated with metadata which is encoded in the bitstream representing the audio content. Usually, the audio content is split up into a sequence of frames, where each frame of audio content comprises a pre-determined number of samples (e.g. 1024 samples). A frame of the sequence of frames may be associated with a respective container or frame of metadata. The container of metadata may be indicative of information describing the frame of audio content that the container is associated with. An example of such information describing the frame may be loudness data regarding some or all of the samples of the frame. Alternatively or in addition, the container of metadata may be used to transmit auxiliary data which may not be directly associated with the corresponding frame of audio content. Such auxiliary data may e.g. be used to provide a decoder of an audio codec system with a firmware upgrade.

In addition to transcoding the audio content from a first audio codec system to a second audio codec system, the transcoder typically also needs to transcode the associated metadata. In order to reduce the cost of transcoders/converters (which are implemented e.g. within set-top boxes), the computational complexity of the conversion between a first audio codec system and a second audio codec system should be relatively low. This should also be the case for the transcoding of the metadata. In the present document, methods and systems for transcoding are described, which allow for the transcoding of metadata with a reduced computational complexity.

SUMMARY

According to an aspect, a transcoder configured to transcode an inbound bitstream into an outbound bitstream is described. The inbound bitstream may comprise an inbound content frame and an associated inbound metadata frame. The associated inbound metadata frame may be comprised within the inbound bitstream directly subsequent to or directly preceding the inbound content frame. As such, the term “associated” may indicate a temporal relationship between a content frame and a metadata frame (e.g. the term may indicate that a content frame directly precedes a metadata frame or vice versa). It should be noted that in some embodiments, the associated inbound metadata frame may be comprised within the inbound content frame. A content frame typically comprises a first element (e.g. a synchronization field) and a last element (e.g. an error correction field such as a CRC field). The associated metadata frame may be positioned in a field of the content frame which is arranged subsequent to the first element of the content frame and prior to the last element of the content frame (e.g. in an auxiliary data field of the content frame).

The metadata frame may be a so-called evolution frame. Typically, the inbound bitstream comprises a sequence of inbound content frames and an associated sequence of inbound metadata frames. The inbound metadata frames are typically interleaved with the inbound content frames, such that a particular inbound content frame is directly followed by its associated metadata frame. In a similar manner to the inbound bitstream (also referred to as an encoded inbound bitstream), the outbound bitstream (or encoded outbound bitstream) may comprise an outbound content frame and an associated outbound metadata frame. In particular, the outbound bitstream may comprise a sequence of outbound content frames and a sequence of outbound metadata frames which are interleaved.

The content frames may be indicative of a signal encoded according to a particular codec scheme. In particular, the inbound content frame may be indicative of a signal encoded according to a first codec system and the outbound content frame may be indicative of the signal encoded according to a second codec system. The first and second codec systems may be the same (in which case, the transcoder may be configured to provide a bit-rate conversion) or the first and the second codec systems may be different (in which case, the transcoder may be configured to provide a codec conversion). The signal may comprise an audio signal. Examples of the first and second codec systems are Dolby E, Dolby Digital Plus, Dolby Digital, Dolby TrueHD, Dolby Pulse, AAC (Advanced Audio Coding) and/or HE-AAC (High Efficiency-AAC). In case of different first and second codec systems, the transcoder may be configured to transcode the signal content from the first codec system to the second codec system. Alternatively or in addition, a bit-rate of the outbound bitstream may differ from a bit-rate of the inbound bitstream, and the transcoder may be configured to perform a transcoding of the encoded signal content from a first bit-rate to a second (different) bit-rate.

The signal is typically represented as a sequence of frames comprising a pre-determined number of samples of the signal (e.g. 512 or 1024 samples of the signal). As such, the inbound content frame may be indicative of some or all samples of a frame of the signal. The outbound content frame may be indicative of some or all samples of the same frame of the signal. As such, the transcoder may be configured to generate an outbound content frame which is indicative of at least some of the samples of the corresponding inbound content frame.

For transcoding the inbound bitstream into the outbound bitstream, the transcoder may comprise a decoder which is configured to decode the inbound bitstream in accordance with the first codec system. As a result of the decoding, the decoder may provide a set of PCM samples for each content frame. Furthermore, the decoder may be configured to extract the metadata from the metadata frames. The decoded inbound bitstream (e.g. the sets of PCM samples and the extracted metadata) may be provided to an encoder which is configured to encode the signal in accordance with the second codec system, thereby providing the outbound bitstream. As such, the transcoder may be configured to generate the outbound content frame from the inbound content frame using a decoder of the first codec system and an encoder of the second codec system. The transcoder may comprise a so-called PCM-connected transcoder, where the decoder passes sets of PCM samples to the encoder of the transcoder. As such, the transcoder described herein may comprise the features described in the context of a PCM-connected transcoder.

It should be noted that the content frame may also be indicative of metadata in accordance with the underlying codec system. In other words, the content frame may comprise metadata associated with the signal comprised within the content frame, wherein the metadata comprised within the content frame is defined by the underlying codec system (i.e. the first or the second codec system). In contrast to this, the metadata frames allow for the transport of additional metadata (in addition to the metadata specified by the codec systems). Examples of such metadata are loudness or dialnorm parameters or auxiliary data such as firmware upgrades for a decoder within an audio content distribution chain.

The metadata frames may follow a pre-determined syntax. In particular, the inbound metadata frame and the outbound metadata frame may follow a common syntax. The syntax for metadata frames may allow a metadata frame to comprise zero, one or more blocks of metadata. Each block of metadata may comprise metadata of a particular type. As such, a metadata frame may have a variable size, depending on the amount of metadata and/or the number of metadata blocks which are incorporated into the metadata frame. Each block of metadata may be indicative of (or may comprise) a corresponding descriptor indicative of one or more properties of the metadata comprised within the corresponding block of metadata. In particular, the descriptor may describe properties which indicate how the metadata of the block may be or should be manipulated. As such, the descriptor of a block may be used by the transcoder to transcode the block(s) comprised in the inbound metadata frame in a computationally efficient manner.

For transcoding of a metadata frame, the transcoder may be configured to identify an inbound block of metadata from the inbound metadata frame. An inbound block may be identified using a block identifier. By way of example, each block of a metadata frame may be identified using a block identifier. Furthermore, the metadata frame may comprise a particular block identifier indicative of the fact that the metadata frame does not comprise any further blocks (referred to e.g. as an end identifier). The end identifier may be used by the transcoder to determine that the metadata frame does not comprise any further metadata blocks.
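By way of illustration, the identification of blocks up to the end identifier may be sketched as follows. The byte layout used here (a one-byte block identifier, a one-byte payload length and the payload itself) as well as the names END_ID and iter_blocks are assumptions made purely for this example; the present document does not prescribe a particular bit-level syntax.

```python
END_ID = 0  # hypothetical value reserved as the end identifier


def iter_blocks(frame: bytes):
    """Yield (block_id, payload) pairs until the end identifier is reached.

    Assumed layout per block: 1 byte identifier, 1 byte payload size, payload.
    The payload is returned as opaque bytes and is not interpreted.
    """
    pos = 0
    while pos < len(frame):
        block_id = frame[pos]
        if block_id == END_ID:
            return  # no further blocks in this metadata frame
        size = frame[pos + 1]
        payload = frame[pos + 2 : pos + 2 + size]
        yield block_id, payload
        pos += 2 + size


# Two blocks, then the end identifier; trailing bytes are ignored.
frame = bytes([1, 2, 0xAA, 0xBB, 7, 1, 0xCC, END_ID, 9, 9])
blocks = list(iter_blocks(frame))
```

In this sketch, anything following the end identifier is never parsed, which reflects the use of the end identifier to stop the search for further metadata blocks.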

As indicated above, the inbound block of metadata may be associated with a descriptor, referred to as an inbound descriptor. The inbound descriptor may be indicative of one or more properties of metadata comprised within the inbound block of metadata. The descriptor may be written into a data field of the block of metadata. An example property comprised within the descriptor is a timestamp parameter which is indicative of a sample of the signal. In particular, the timestamp parameter may indicate that the metadata of the inbound block is associated with (e.g. is to be applied to) the sample of the signal which is identified by the timestamp parameter. The timestamp parameter may identify the sample by indicating the position of the sample within a content frame relative to the end or relative to the beginning of the content frame. A further example is a duration parameter indicative of a number of samples of the signal. The duration parameter may indicate that the metadata of the inbound block is associated with the number of samples of the signal indicated by the duration parameter (starting from the sample indicated by the timestamp parameter). In particular, the duration parameter may indicate that the metadata is to be applied to a number of samples subsequent to the sample indicated by the timestamp parameter, wherein the number of samples is indicated by the duration parameter. The timestamp and/or duration parameters may be used e.g. to indicate for which samples of the signal encoded in the associated inbound content frame the metadata (e.g. a loudness value) of the inbound block is applicable. By way of example, the inbound metadata frame may comprise a plurality of inbound blocks indicative of different loudness values for different groups of samples of the signal encoded in the inbound content frame.
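The use of the timestamp and duration parameters may be illustrated by the following minimal sketch. The Descriptor class and the applicable_samples function are hypothetical names. The sketch assumes that the timestamp is counted from the beginning of the content frame, and that an absent timestamp or duration defaults to the frame start or to the remainder of the frame, respectively; these defaults are assumptions of the example and are not taken from the present document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    timestamp: Optional[int] = None  # sample offset from frame start (assumed)
    duration: Optional[int] = None   # number of samples covered


def applicable_samples(desc: Descriptor, frame_len: int) -> range:
    """Return the sample indices (relative to frame start) a block applies to."""
    start = 0 if desc.timestamp is None else desc.timestamp
    count = frame_len - start if desc.duration is None else desc.duration
    return range(start, start + count)


# Two loudness blocks covering the two halves of a 1024-sample frame.
first_half = applicable_samples(Descriptor(timestamp=0, duration=512), 1024)
second_half = applicable_samples(Descriptor(timestamp=512, duration=512), 1024)
whole_frame = applicable_samples(Descriptor(), 1024)
```

The two descriptors correspond to the case of a plurality of inbound blocks carrying different loudness values for different groups of samples of the same content frame.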

Another example of a property indicated (or comprised) within the descriptor is a transcode parameter indicative of whether or not the inbound block is to be transcoded into the outbound bitstream. By way of example, the transcode parameter may be used to indicate that the metadata comprised within the inbound block is only applicable for the first codec system. As such, the transcoder may be configured to drop the metadata comprised within the inbound block, if the outbound bitstream is encoded in accordance with a second codec system which is different from the first codec system.

A further example of a property comprised within the descriptor is a duplicate parameter indicative of whether the metadata of the inbound block is to be included in every outbound metadata frame which is generated from the inbound metadata frame. In a similar manner, a de-duplicate parameter may be used as a property which is indicative of whether the metadata of the inbound block is to be discarded by the transcoder, if the outbound metadata frame is generated from a plurality of inbound metadata frames. The duplicate and/or de-duplicate parameters may be used by the transcoder in situations where the framing of the inbound bitstream and the outbound bitstream differs.

A further example of a property is a priority parameter which is indicative of an importance of the metadata of the inbound block, relative to one or more other inbound blocks of metadata. The priority parameter may be used by the transcoder in situations where only a reduced amount of metadata can be inserted into the outbound bitstream compared to the inbound bitstream. Another example of a property is an association parameter indicative of whether or not the metadata of the inbound block may be inserted into a delayed outbound metadata frame subsequent to the outbound metadata frame. As such, the association parameter provides the transcoder with additional flexibility in the transcoding process, as the transcoder may decide in an efficient manner which inbound blocks may be delayed and which inbound blocks have to be maintained in association with the associated content frames.

Another example of a property is a PCM processing parameter indicative of whether or not the metadata of the inbound block is to be discarded by the transcoder, subject to a modification of data comprised within the inbound content frame. In particular, the PCM processing parameter may indicate to the transcoder that the metadata of the inbound block is to be included into the outbound metadata frame, even if the data of the inbound content frame (e.g. the samples of the signal comprised within the inbound content frame) has been modified. This may be the case, e.g. when the inbound block comprises a payload such as binary data or such as an additional bitstream, which is unrelated to the data comprised within the inbound content frame. The PCM processing parameter is particularly relevant for so-called PCM-connected transcoders.

A preferred inbound descriptor comprises at least an indication of whether a timestamp parameter and/or a duration parameter are comprised within the descriptor. Furthermore, a preferred inbound descriptor comprises a duplicate and a de-duplicate parameter.

The transcoder may be configured to generate the outbound metadata frame from the inbound metadata frame based on the inbound descriptor. In particular, the transcoder may be configured to generate the outbound metadata frame from the inbound metadata frame only based on the one or more properties indicated by the inbound descriptor. Even more particularly, the transcoder may be configured to generate the outbound metadata frame from the inbound metadata frame without analyzing the metadata comprised within the inbound block. As such, the transcoder may perform transcoding of the metadata comprised within a metadata frame solely based on the descriptors of the blocks of metadata, without the need to analyze and/or interpret the metadata carried by the blocks of metadata. This results in a transcoder having a significantly reduced computational complexity.

The transcoder may be configured to generate the outbound metadata frame from the inbound metadata frame by copying the metadata from the one or more inbound blocks of the inbound metadata frame to corresponding one or more outbound blocks. The one or more outbound blocks may be inserted into the outbound metadata frame. The copying and inserting may be subjected to the one or more properties indicated by the inbound descriptor(s) of the one or more inbound blocks. By way of example, the association parameter may indicate to the transcoder that a particular inbound block is to be inserted into the outbound metadata frame. On the other hand, the transcode parameter may indicate to the transcoder that the particular inbound block should be dropped, if the second codec system is different from the first codec system.
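The descriptor-driven copying described above may be sketched as follows. The Block class, its transcode field and the function transcode_metadata_frame are hypothetical; in particular, the encoding of the transcode parameter as a Boolean, where False marks metadata that is only applicable to the first codec system, is an assumption of the example. The sketch is intended to show that the payload is copied as opaque bytes and is never parsed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    payload: bytes          # opaque metadata, never inspected by the transcoder
    transcode: bool = True  # assumed encoding: False = first codec system only


def transcode_metadata_frame(inbound: List[Block], same_codec: bool) -> List[Block]:
    """Build the outbound blocks by looking at descriptors only."""
    outbound = []
    for block in inbound:
        if not same_codec and not block.transcode:
            continue  # drop codec-specific metadata on a codec conversion
        outbound.append(block)  # copy descriptor and payload unchanged
    return outbound


inbound = [Block(b"loudness"), Block(b"codec-specific", transcode=False)]
kept_on_conversion = transcode_metadata_frame(inbound, same_codec=False)
kept_on_rate_change = transcode_metadata_frame(inbound, same_codec=True)
```

Since the decision depends only on a descriptor field, the cost per block in this sketch is independent of the type and the size of the carried metadata.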

The transcoder may be configured to generate the outbound metadata frame by generating an outbound descriptor of the outbound block based on the inbound descriptor of the inbound block. In particular, the outbound descriptor may comprise or may be indicative of some or all of the properties indicated by the inbound descriptor. Some or all of the properties of the inbound descriptor may be copied to the outbound descriptor. On the other hand, the transcoder may be configured to modify one or more of the properties indicated by the inbound descriptor for generating the outbound descriptor, wherein the outbound descriptor is indicative of the one or more modified properties. By way of example, the inbound descriptor may be indicative of a timestamp parameter. The timestamp parameter may be modified by the transcoder such that the modified timestamp parameter indicates the same sample of the signal as the original timestamp parameter, even though the transcoder may have performed a re-framing of the outbound bitstream with respect to the inbound bitstream.

As indicated above, the one or more properties of the inbound descriptor may comprise a timestamp parameter indicative of a sample of the signal, which the metadata of the inbound block is associated with. The timestamp parameter of the inbound descriptor typically indicates the sample of the signal relative to the inbound content frame. The transcoder may be configured to generate an outbound block from the inbound block. Furthermore, the transcoder may be configured to generate an outbound descriptor of the outbound block by modifying the timestamp parameter of the inbound descriptor such that the corresponding timestamp parameter of the outbound descriptor indicates the sample of the signal relative to the outbound content frame (which may have a different framing than the inbound content frame). As such, the transcoder may be configured to ensure that the one or more properties indicated by the inbound descriptor remain valid, even when the inbound bitstream is subjected to re-framing.
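The modification of the timestamp parameter under re-framing may be illustrated by a short worked example. The function reframe_timestamp is a hypothetical name; the sketch assumes that timestamps are counted from the beginning of a content frame, that both bitstreams start at the same sample, and that the frame lengths of 1024 and 1536 samples are merely example values.

```python
def reframe_timestamp(in_frame_idx: int, timestamp: int,
                      in_len: int, out_len: int):
    """Map (inbound frame index, offset) to (outbound frame index, offset).

    The absolute sample position is computed first, then expressed relative
    to the outbound framing, so that the same sample remains addressed.
    """
    absolute = in_frame_idx * in_len + timestamp
    return divmod(absolute, out_len)


# Sample 100 of inbound frame 2 (1024-sample frames) is absolute sample 2148,
# i.e. sample 612 of outbound frame 1 when outbound frames hold 1536 samples.
out_frame, out_ts = reframe_timestamp(2, 100, in_len=1024, out_len=1536)
```

Only the descriptor is touched in this sketch; the metadata payload of the block does not need to be read in order to keep the timestamp valid.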

The transcoder may be configured to insert the outbound block (generated from the inbound block of the inbound metadata frame) into a delayed outbound metadata frame. By way of example, the association parameter of the inbound descriptor may indicate to the transcoder that the inbound block may be delayed. The transcoder may choose to insert the metadata into a delayed outbound metadata frame (e.g. due to a limited bit-rate of the outbound bitstream). The delayed outbound metadata frame may be associated with a delayed outbound content frame which does not comprise the sample of the signal which is indicated by the timestamp parameter of the inbound block. In order to ensure that, nevertheless, the timestamp parameter of the outbound block identifies the correct sample of the signal, the transcoder may be configured to generate the outbound descriptor of the outbound block by modifying the timestamp parameter of the inbound block such that the timestamp parameter of the outbound descriptor indicates the sample of the signal relative to the delayed outbound content frame. By way of example, the modified timestamp parameter may indicate a sample number which exceeds the number of samples of the delayed content frame, thereby indicating that the sample of the signal lies outside of the delayed content frame.
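The timestamp adjustment for a delayed block may be sketched as follows. The function delay_timestamp is a hypothetical name, and the sketch assumes one particular convention, namely that the timestamp counts samples backwards from the end of the content frame and that all frames have the same length. Under this assumed convention, a value larger than the frame length addresses a sample located before the delayed content frame.

```python
def delay_timestamp(ts_from_end: int, frames_delayed: int, frame_len: int) -> int:
    """Re-express a timestamp relative to the end of a later content frame.

    ts_from_end: distance (in samples) from the addressed sample to the end
                 of the content frame it belongs to (assumed convention).
    """
    return ts_from_end + frames_delayed * frame_len


# A block addressing the sample 200 samples before the end of its own frame
# is carried one frame later; the new value exceeds the 1024-sample frame.
new_ts = delay_timestamp(200, frames_delayed=1, frame_len=1024)
lies_outside_delayed_frame = new_ts > 1024
```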

As indicated above, the one or more properties of the inbound descriptor may comprise a duplicate parameter indicative of whether the metadata of the corresponding inbound block is to be included in every outbound metadata frame which is generated from the inbound metadata frame. The transcoder may be configured to generate a plurality of outbound metadata frames from the inbound metadata frame, by taking into account the duplicate parameter. In particular, the transcoder may be configured to determine that the duplicate parameter indicates that the metadata of the inbound block is to be included in every outbound metadata frame which is generated from the inbound metadata frame. In such a case, the transcoder may be configured to insert the metadata of the inbound block into each of the plurality of outbound metadata frames. In particular, the transcoder may be configured to generate an outbound block from the inbound block for each of the plurality of outbound metadata frames. In addition to generating a plurality of outbound metadata frames, the transcoder may be configured to generate a plurality of outbound content frames from the inbound content frame, wherein the plurality of outbound content frames may be associated with the plurality of outbound metadata frames, respectively.

The duplicate parameter may comprise a flag which may be set to indicate that the metadata of the inbound block is to be included in every outbound metadata frame which is generated from the inbound metadata frame, or vice versa (i.e. the flag may be set to indicate the contrary instead).
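The handling of the duplicate parameter when one inbound metadata frame is split into several outbound metadata frames may be sketched as follows. The function split_frame is a hypothetical name, blocks are modelled as (payload, duplicate flag) pairs, and the placement of non-duplicated blocks into the first outbound metadata frame only is an assumption of the example.

```python
from typing import List, Tuple


def split_frame(blocks: List[Tuple[str, bool]], n_out: int) -> List[List[str]]:
    """Distribute the blocks of one inbound frame over n_out outbound frames."""
    frames: List[List[str]] = [[] for _ in range(n_out)]
    for payload, duplicate in blocks:
        # Duplicated blocks go into every outbound frame; other blocks are
        # placed in the first outbound frame only (assumption of this sketch).
        targets = frames if duplicate else frames[:1]
        for frame in targets:
            frame.append(payload)
    return frames


outbound_frames = split_frame([("loudness", True), ("firmware", False)], n_out=2)
```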

As indicated above, the one or more properties of the inbound descriptor may comprise a de-duplicate parameter indicative of whether the metadata of the inbound block may be (or is to be) discarded by the transcoder, if the outbound metadata frame is generated from a plurality of inbound metadata frames. The transcoder may be configured to generate the outbound metadata frame from a plurality of inbound metadata frames of the inbound bitstream, by taking into account the de-duplicate parameter. In particular, the plurality of inbound metadata frames may comprise a plurality of inbound blocks of metadata, each inbound block being associated with a respective de-duplicate parameter indicating that the metadata of the inbound block may be discarded by the transcoder. The transcoder may be configured to discard the metadata of the plurality of inbound blocks for all but one of the plurality of inbound metadata frames (e.g. for all but the first one of the plurality of inbound metadata frames), for generating the outbound metadata frame. In addition to generating the outbound metadata frame from a plurality of inbound metadata frames, the transcoder may be configured to generate the outbound content frame from a plurality of inbound content frames, wherein the plurality of inbound content frames are associated with the plurality of inbound metadata frames, respectively.

The de-duplicate parameter may comprise a flag which may be set to indicate that the metadata of the inbound block may be (or is to be) discarded by the transcoder, if the outbound metadata frame is generated from a plurality of inbound metadata frames, or vice versa (i.e. the flag may be set to indicate the contrary instead).
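The converse case, in which one outbound metadata frame is generated from a plurality of inbound metadata frames, may be sketched as follows. The function merge_frames is a hypothetical name and blocks are modelled as (payload, de-duplicate flag) pairs. In line with the example given above, flagged blocks are kept from the first inbound metadata frame only.

```python
from typing import List, Tuple


def merge_frames(inbound_frames: List[List[Tuple[str, bool]]]) -> List[str]:
    """Merge several inbound metadata frames into one outbound frame."""
    outbound: List[str] = []
    for idx, frame in enumerate(inbound_frames):
        for payload, deduplicate in frame:
            if deduplicate and idx > 0:
                continue  # discard repeats from all but the first inbound frame
            outbound.append(payload)
    return outbound


merged = merge_frames([
    [("loudness", True), ("aux-a", False)],
    [("loudness", True), ("aux-b", False)],
])
```

It should be noted that the sketch decides on the flag alone and does not compare payloads, so that no analysis of the metadata is required.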

As indicated above, the one or more properties of the inbound descriptor may comprise a priority parameter indicative of an importance of the metadata of the inbound block relative to one or more other inbound blocks of metadata. The inbound metadata frame received at the transcoder may comprise a plurality of inbound blocks with descriptors indicating different values for the priority parameter. The transcoder may be configured to generate the outbound metadata frame from the plurality of inbound blocks in accordance with the priority parameters of the plurality of inbound blocks. In particular, the transcoder may first select the inbound block(s) having the highest relative priority and only insert the lower priority inbound blocks, if sufficient bit-rate is available for the outbound bitstream.

The plurality of inbound blocks may be associated with incremental priority parameters indicating incremental priorities. The plurality of inbound blocks may comprise incremental metadata, such that the combined metadata of the plurality of inbound blocks provides high quality metadata and such that the metadata of the inbound block having the highest relative priority from the plurality of inbound blocks provides reduced quality metadata (i.e. providing metadata with a quality which is reduced compared to the high quality metadata provided by the combined metadata). The inbound block with the next lower priority may provide an increase of the quality of the metadata and so on, until the highest quality of metadata is provided when combining the complete plurality of inbound blocks. The transcoder may be configured to generate the outbound metadata frame based on at least one or more of the plurality of inbound blocks, thereby allowing for a scalable degradation of the quality of the metadata comprised within the outbound metadata frame. The degree of degradation may e.g. be based on the available bit-rate of the outbound bitstream.
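The priority-based selection of incremental blocks may be sketched as follows. The function select_by_priority is a hypothetical name. The sketch assumes that a lower numerical value denotes a higher priority and that the available bit-rate is expressed as a byte budget for the outbound metadata frame; the selection stops at the first block which does not fit, since in the incremental case a lower priority block refines the blocks of higher priority.

```python
from typing import List, Tuple


def select_by_priority(blocks: List[Tuple[int, bytes]], budget: int) -> List[bytes]:
    """Pick blocks in priority order until the byte budget is exhausted."""
    chosen: List[bytes] = []
    used = 0
    for _priority, payload in sorted(blocks, key=lambda b: b[0]):
        if used + len(payload) > budget:
            break  # lower priority layers are dropped: scalable degradation
        chosen.append(payload)
        used += len(payload)
    return chosen


layers = [(2, b"detail--"), (0, b"base"), (1, b"mid-")]
reduced_quality = select_by_priority(layers, budget=8)
full_quality = select_by_priority(layers, budget=100)
```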

As indicated above, the one or more properties of the inbound descriptor may comprise an association parameter indicative of whether or not the metadata of the inbound block may be inserted into a delayed outbound metadata frame subsequent to the outbound metadata frame. The transcoder may be configured to insert the metadata from the inbound block into the outbound metadata frame, based on the association parameter and/or based on bit-rate restrictions on the outbound bitstream. In particular, the transcoder may be configured to insert the metadata from the inbound block into a delayed outbound metadata frame subsequent to the outbound metadata frame, if the association parameter indicates that the metadata of the inbound block may be delayed.
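The interplay of the association parameter and a bit-rate restriction may be sketched as follows. The function place_blocks is a hypothetical name, blocks are modelled as (payload, may-be-delayed flag) pairs, and the bit-rate restriction is modelled as a byte budget. The sketch assumes that blocks which have to remain associated with their content frame are always inserted into the current outbound metadata frame, even if the budget is thereby exceeded.

```python
from typing import List, Tuple


def place_blocks(blocks: List[Tuple[bytes, bool]], budget: int):
    """Split blocks into (current frame, delayed frame) payload lists."""
    current: List[bytes] = []
    delayed: List[bytes] = []
    used = 0
    for payload, may_delay in blocks:
        if may_delay and used + len(payload) > budget:
            delayed.append(payload)  # carried in a subsequent metadata frame
        else:
            current.append(payload)  # stays with the associated content frame
            used += len(payload)
    return current, delayed


current, delayed = place_blocks(
    [(b"loudness", False), (b"firmware-chunk", True), (b"x", True)],
    budget=10,
)
```

In this example the 14-byte auxiliary block is deferred, whereas the loudness block, which is tied to the samples of the current content frame, is kept in place.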

According to a further aspect, a method for transcoding an inbound bitstream comprising an inbound content frame and an associated inbound metadata frame into an outbound bitstream is described. The outbound bitstream may comprise an outbound content frame and/or an associated outbound metadata frame. The inbound content frame may be indicative of a signal encoded according to a first codec system and the outbound content frame may be indicative of the signal encoded according to a second codec system. As indicated above, the first and second codec systems may be the same or may be different. The method may comprise identifying an inbound block of metadata from the inbound metadata frame. The inbound block of metadata may be associated with an inbound descriptor indicative of one or more properties of metadata comprised within the inbound block of metadata. Furthermore, the method may comprise generating the outbound metadata frame from the inbound metadata frame based on the inbound descriptor. In other words, the outbound metadata frame may be determined by considering the inbound descriptor, typically without the need to further analyze the metadata comprised within the inbound metadata frame.

According to another aspect, an encoded bitstream comprising a content frame and an associated metadata frame is described. The content frame may be indicative of a signal encoded according to a first codec system. The metadata frame may comprise a block of metadata and the block of metadata may be associated with (or may comprise) a descriptor indicative of one or more properties of metadata comprised within the block of metadata.

According to a further aspect, an encoder configured to generate an encoded bitstream comprising a content frame and an associated metadata frame is described. The content frame may be indicative of a signal encoded according to a codec system. The encoder may be configured to generate a block of metadata. Furthermore, the encoder may be configured to determine a descriptor associated with the block of metadata. The descriptor may be indicative of one or more properties of the metadata comprised within the block of metadata. Furthermore, the encoder may be configured to insert the block of metadata into the metadata frame. It should be noted that the features described in the present document in the context of a transcoder are also applicable to a corresponding encoder.

In particular, the one or more properties may comprise a timestamp parameter indicative of a sample of the signal, which the metadata comprised in the block of metadata is associated with. The sample of the signal may be comprised within the content frame. The encoder may be configured to insert the block into a delayed metadata frame, wherein the delayed metadata frame is associated with a delayed content frame which does not comprise the sample of the signal. Furthermore, the encoder may be configured to generate the descriptor of the block of metadata such that the timestamp parameter of the descriptor indicates the sample of the signal relative to the delayed content frame. As such, the encoder may be configured to delay the transmission of metadata and to modify the timestamp parameter accordingly, thereby smoothing the bit-rate of the bitstream generated by the encoder.

According to an aspect, a corresponding decoder is described. The decoder may comprise any of the decoder related features described in the present document. The decoder may be configured to decode an encoded bitstream comprising a content frame and an associated metadata frame. As outlined above, the content frame is indicative of a signal encoded according to a first codec system. The metadata frame may comprise a block of metadata, wherein the block of metadata is associated with (or comprises) a descriptor indicative of one or more properties of metadata comprised within the block of metadata. The decoder may be configured to decode the encoded signal comprised within the content frame. In particular, the decoder may comprise a decoder of the first codec system to decode the encoded signal. As a result, the decoder may be configured to provide a set of PCM samples of the encoded signal.

Furthermore, the decoder may be configured to identify the block of metadata from the metadata frame and to extract the descriptor from the block of metadata. In addition, the decoder may be configured to process the metadata comprised within the block of metadata in dependence on the one or more properties indicated by the descriptor. The one or more properties may correspond to any one or more of the properties described in the present document. The decoder may be configured to associate a particular property of the metadata with corresponding processing of the metadata. By way of example, the descriptor may be indicative of a timestamp parameter, thereby informing the decoder that the metadata of the block of metadata is to be applied to a particular sample of the signal. As such, the decoder may be configured to apply the metadata to the sample indicated by the timestamp parameter. As another example, the descriptor may be indicative of an association parameter. If the association parameter indicates that the block of metadata is unrelated to the content frame, the decoder may be configured to pass the metadata comprised within the block of metadata to another processing unit (which deals e.g. with auxiliary data comprised within the block of metadata).

According to a further aspect, a method for decoding an encoded bitstream comprising a content frame and an associated metadata frame is described. The content frame may be indicative of a signal encoded according to a first codec system. The metadata frame may comprise a block of metadata, wherein the block of metadata may be associated with a descriptor indicative of one or more properties of metadata comprised within the block of metadata. The method may comprise decoding the encoded signal comprised within the content frame. Furthermore, the method may comprise identifying the block of metadata from the metadata frame and extracting the descriptor from the block of metadata. In addition, the method may comprise processing the metadata comprised within the block of metadata based on the one or more properties indicated by the descriptor.

According to another aspect, a method for generating an encoded bitstream comprising a content frame and an associated metadata frame is described. The content frame may be indicative of a signal encoded according to a codec system. The method may comprise generating a block of metadata. Furthermore, the method may comprise determining a descriptor associated with the block of metadata, wherein the descriptor is indicative of one or more properties of the metadata comprised within the block of metadata. In addition, the method may comprise inserting the block of metadata into the metadata frame.

According to a further aspect, an encoder configured to generate an encoded bitstream comprising a content frame and an associated metadata frame is described. The content frame may be indicative of a signal encoded according to a first codec system. The encoder may be configured to generate a block of metadata. In a preferred embodiment, the block of metadata comprises a descriptor as described in the present document. The descriptor may be indicative of one or more properties of the metadata comprised within the block of metadata.

The encoder may be configured to insert the block of metadata into the metadata frame. Furthermore, the encoder may be configured to select a secure key from a plurality of pre-determined secure keys. The plurality of pre-determined secure keys may be configured such that it provides different levels of trust. In particular, the plurality of pre-determined secure keys may comprise a highly secure key known only to a developer of the encoder (or of a corresponding decoder or of a corresponding transcoder comprising a decoder and an encoder). Furthermore, the plurality of pre-determined secure keys may comprise a moderate secure key known to an operator of the encoder (or of a corresponding decoder or of a corresponding transcoder comprising a decoder and an encoder).

The encoder may be configured to generate a cryptographic value based at least on the content frame, on the associated metadata frame and on the selected secure key. In particular, the encoder may be configured to calculate an HMAC-MD5 value or an HMAC-SHA256 value (Secure Hash Algorithm as specified in the Federal Information Processing Standard FIPS PUB 180-2) for generating the cryptographic value. In addition, the encoder may be configured to truncate the HMAC-MD5 or HMAC-SHA256 value to yield the cryptographic value. By truncating the HMAC value, the overhead required for the cryptographic value may be reduced. The encoder may be configured to insert the generated cryptographic value into the metadata frame, thereby ensuring that the content frame and/or the metadata frame cannot be modified by an unauthorized party without being detected.
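A minimal sketch of such a truncated HMAC is given below, using HMAC-SHA256 from the Python standard library. The function name, the 16-byte truncation length and the order in which the two frames are fed into the HMAC are assumptions made for illustration; the present document does not prescribe them.

```python
import hashlib
import hmac


def cryptographic_value(content_frame: bytes, metadata_frame: bytes,
                        key: bytes, n_bytes: int = 16) -> bytes:
    """Truncated HMAC-SHA256 over a content frame and its metadata frame."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(content_frame)   # assumed order: content first,
    mac.update(metadata_frame)  # then the associated metadata
    # Keeping only the first n_bytes reduces the overhead in the bitstream.
    return mac.digest()[:n_bytes]


value = cryptographic_value(b"encoded audio", b"metadata blocks", b"secret key")
print(len(value))
```

A shorter truncation length lowers the overhead per frame but also lowers the effort needed to forge a matching value, so the length is a trade-off rather than a fixed constant.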

The use of different secure keys which provide different levels of trust enables a corresponding decoder (or a transcoder comprising a decoder) to verify whether the received bitstream has been modified and, if so, which party has modified the received bitstream. By way of example, the encoder may have initially generated the bitstream using the highly secure key. An intermediate party may have modified the bitstream and may have used the moderate secure key to generate a modified cryptographic value. As such, the decoder is aware that the received bitstream has been modified by a party having access to the moderate secure key. It should be noted that the plurality of pre-determined secure keys may comprise more than two levels of trust, thereby providing a decoder with more details regarding the trustworthiness of a received bitstream.

The encoder may be configured to insert an indication of the selected secure key into the metadata frame, thereby enabling the corresponding decoder to easily verify whether the received bitstream has been modified. On the other hand, the decoder may be configured to verify the authenticity of the received bitstream using all of the plurality of pre-determined secure keys, without a need for the indication of the selected secure key to be inserted into the metadata frame.

The encoder may be configured to generate a plurality of succeeding content frames and associated metadata frames for the encoded bitstream. Furthermore, the encoder may be configured to generate a frame cryptographic value based on a single content frame and its associated metadata frame and based on the selected secure key. The frame cryptographic value may be inserted into the associated metadata frame and may be used by a corresponding decoder (or transcoder) to verify the authenticity of an individual content/metadata frame. Furthermore, the encoder may be configured to generate a historic cryptographic value based on at least some of the plurality of succeeding content frames and their associated metadata frames, and based on the selected secure key. The historic cryptographic value may be inserted into one of the plurality of succeeding metadata frames and may be used by the corresponding decoder (or transcoder) to verify the correct sequential order of the plurality of succeeding content frames and metadata frames.
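The difference between the two values can be sketched as follows. The sketch assumes that the historic value is a single truncated HMAC-SHA256 computed over a run of frames in transmission order; this construction, the function names and the truncation length are illustrative assumptions and are not specified in the present document.

```python
import hashlib
import hmac


def frame_value(content: bytes, metadata: bytes, key: bytes,
                n_bytes: int = 16) -> bytes:
    """Value protecting one content frame and its metadata frame."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(content)
    mac.update(metadata)
    return mac.digest()[:n_bytes]


def historic_value(frames, key: bytes, n_bytes: int = 16) -> bytes:
    """Value protecting a sequence of (content, metadata) pairs, in order."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for content, metadata in frames:
        mac.update(content)
        mac.update(metadata)
    return mac.digest()[:n_bytes]


key = b"selected secure key"
frames = [(b"content-1", b"meta-1"), (b"content-2", b"meta-2")]
swapped = frames[::-1]

# Each frame still verifies on its own after re-ordering ...
print(frame_value(*frames[0], key) == frame_value(*swapped[1], key))
# ... but the historic value changes when the order changes.
print(historic_value(frames, key) == historic_value(swapped, key))
```

The per-frame value alone cannot reveal that two valid frames were swapped, which is why a value spanning several frames is needed to check the sequential order.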

According to a further aspect, a method for generating an encoded bitstream comprising a content frame and an associated metadata frame is described. The content frame may be indicative of a signal encoded according to a first codec system. The method may comprise generating a block of metadata and inserting the block of metadata into the metadata frame. Furthermore, the method may comprise selecting a secure key from a plurality of pre-determined secure keys, wherein the plurality of pre-determined secure keys provides different levels of trust. In addition, the method may comprise generating a cryptographic value based at least on the content frame, on the associated metadata frame and on the selected secure key. The generated cryptographic value may then be inserted into the metadata frame.

According to a further aspect, a corresponding decoder is described. The decoder may be configured to receive an encoded bitstream comprising a content frame and an associated metadata frame. The encoded bitstream may have any one or more of the properties described in the present document. In particular, the content frame may be indicative of a signal encoded according to a first codec system (e.g. a codec system as referred to in the present document). The decoder may be configured to extract a cryptographic value from the metadata frame. The cryptographic value may have been inserted into the metadata frame by a corresponding encoder, as described in the present document. In particular, the cryptographic value may have been determined using one of a plurality of pre-determined secure keys. As outlined above, the plurality of pre-determined secure keys may provide different levels of trust. By way of example, the plurality of pre-determined secure keys may comprise a highly secure key and a moderate secure key.

The decoder may be configured to determine a secure key from the plurality of pre-determined secure keys. In particular, the decoder may be configured to determine the secure key by extracting the secure key from the metadata frame (e.g. from a particular field of the metadata frame). Furthermore, the decoder may be configured to generate a verification cryptographic value based at least on the received content frame, on the received associated metadata frame and on the determined secure key. In addition, the decoder may be configured to compare the extracted cryptographic value and the verification cryptographic value, in order to determine whether the received encoded bitstream can be trusted. By way of example, if the extracted cryptographic value and the verification cryptographic value match, the decoder may determine that the received encoded bitstream can be trusted. Furthermore, the secure key used for generating the verification cryptographic value may indicate to the decoder the level of trust which is associated with the received encoded bitstream. By way of example, the highly secure key may indicate a higher level of trust than the moderate secure key.

The decoder may be configured to determine which one of the plurality of pre-determined secure keys has been used to generate the extracted cryptographic value. As indicated above, the secure key, which has been used to generate the extracted cryptographic value, may provide an indication of the level of trust of the received encoded bitstream. In particular, the decoder may be configured to generate a plurality of verification cryptographic values for the plurality of pre-determined secure keys, respectively. Furthermore, the decoder may be configured to compare each one of the plurality of verification cryptographic values with the extracted cryptographic value. In addition, the decoder may be configured to determine that one of the plurality of pre-determined secure keys has been used to generate the extracted cryptographic value, if the comparison shows that one of the plurality of verification cryptographic values matches the extracted cryptographic value.
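The comparison against all pre-determined keys can be sketched as follows. The key names, the mapping from trust level to key and the 16-byte truncated HMAC-SHA256 are assumptions made for illustration only.

```python
import hashlib
import hmac

N_BYTES = 16  # assumed length of the truncated cryptographic value


def verification_value(content: bytes, metadata: bytes, key: bytes) -> bytes:
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(content)
    mac.update(metadata)
    return mac.digest()[:N_BYTES]


def trust_level(content: bytes, metadata: bytes, extracted: bytes, keys: dict):
    """Return the trust level whose key reproduces the extracted value.

    keys maps a (hypothetical) trust level name to its pre-determined key.
    Returns None if no key matches, i.e. the bitstream cannot be trusted.
    """
    for level, key in keys.items():
        candidate = verification_value(content, metadata, key)
        # compare_digest avoids leaking the match position through timing.
        if hmac.compare_digest(candidate, extracted):
            return level
    return None


keys = {"high": b"developer key", "moderate": b"operator key"}
extracted = verification_value(b"content", b"metadata", keys["moderate"])
print(trust_level(b"content", b"metadata", extracted, keys))
```

In this sketch, trying every key makes an explicit key indication in the metadata frame unnecessary, at the cost of one HMAC computation per candidate key.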

According to a further aspect, a method for determining a level of trust of a received encoded bitstream comprising a content frame and an associated metadata frame is described. The content frame may be indicative of a signal encoded according to a first codec system. The method may comprise extracting a cryptographic value from the metadata frame. Furthermore, the method may comprise determining a secure key from a plurality of pre-determined secure keys, wherein the plurality of pre-determined secure keys provides different levels of trust. In addition, the method may comprise generating a verification cryptographic value based at least on the content frame, on the associated metadata frame and on the determined secure key. The method may proceed by comparing the extracted cryptographic value and the verification cryptographic value to determine a level of trust for the received encoded bitstream, wherein the level of trust may be indicated by the determined secure key.

According to another aspect, a transcoder configured to transcode an inbound bitstream comprising an inbound content frame and an associated inbound metadata frame into an outbound bitstream is described. The transcoder may comprise any of the transcoder related features described in the present document. As outlined above, the inbound bitstream may be indicative of a set of samples of a signal, e.g. indicative of the samples of a frame of the signal. The transcoder may comprise a decoder (for decoding the received inbound bitstream) and an encoder (for re-encoding the decoded inbound bitstream to provide the transcoded outbound bitstream). The transcoder may comprise a so-called PCM-connected transcoder.

The decoder of the transcoder may be configured to convert the inbound content frame into a set of decoded PCM samples of the signal. Furthermore, the decoder may be configured to extract metadata from the inbound metadata frame. As such, the decoder may be configured to convert the inbound bitstream into a sequence of decoded PCM samples and associated metadata. The sequence of decoded PCM samples and the associated metadata may be used by the encoder of the transcoder to generate the outbound bitstream (in accordance with the second codec system). The decoder may be configured to generate a signature value for the set of decoded PCM samples and the extracted metadata, using a decoder secure key. The signature value may be generated using an HMAC-MD5 or an HMAC-SHA256 hash function. The resulting value may be truncated to provide the signature value. As such, the decoder may be configured to provide a signature value, thereby enabling the encoder to verify whether the decoded PCM samples and/or the metadata have been modified by an unauthorized entity (which does not have access to the decoder secure key) between the decoder and the encoder of the transcoder.

The encoder of the transcoder may be configured to receive a set of PCM samples and associated metadata. The received set of PCM samples typically corresponds to the set of decoded PCM samples provided by the decoder and the received metadata typically corresponds to the extracted metadata from the decoder. However, the PCM samples and/or the metadata may have been modified, such that the received set of PCM samples and/or the received metadata may differ from the set of decoded PCM samples and/or the extracted metadata.

The encoder may be configured to receive a signature value. The received signature value may correspond or may be equal to the signature value generated by the decoder. On the other hand, the received signature value may be different from the signature value generated by the decoder (e.g. if modified by an authorized entity, subject to modification of the PCM samples and/or the metadata). The encoder may be configured to verify whether the received signature value is valid for the received set of PCM samples and associated metadata, using an encoder secure key. Furthermore, the encoder may be configured to generate an outbound content frame of the outbound bitstream from the received set of PCM samples and generate an associated outbound metadata frame of the outbound bitstream from the received metadata, if the received signature is valid. On the other hand, the encoder may be configured to prevent the insertion of the received metadata into the outbound bitstream, if the received signature is not valid. As such, the encoder of the transcoder may be configured to prevent the insertion of metadata frames into the outbound bitstream, if the metadata or the PCM samples have been modified by an unauthorized entity.
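The signing and verification steps at the PCM interface can be sketched as follows. The sketch treats PCM samples and metadata as opaque byte strings and models only the decision whether the metadata is passed on; the function names, the 16-byte truncated HMAC-SHA256 and the choice to still forward the PCM samples when the signature is invalid are assumptions for illustration.

```python
import hashlib
import hmac

SIG_BYTES = 16  # assumed length of the truncated signature value


def sign(pcm: bytes, metadata: bytes, key: bytes) -> bytes:
    """Signature value as the decoder might generate it."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(pcm)
    mac.update(metadata)
    return mac.digest()[:SIG_BYTES]


def accept_for_encoding(pcm: bytes, metadata: bytes, signature: bytes,
                        encoder_key: bytes):
    """Return (pcm, metadata) to encode; metadata is None if unverified."""
    expected = sign(pcm, metadata, encoder_key)
    if hmac.compare_digest(expected, signature):
        return pcm, metadata
    # Invalid signature: the metadata is kept out of the outbound bitstream.
    return pcm, None


key = b"decoder secure key"
pcm, meta = b"\x00\x01\x02\x03", b"loudness=-24"
sig = sign(pcm, meta, key)

print(accept_for_encoding(pcm, meta, sig, key))
print(accept_for_encoding(pcm + b"\xff", meta, sig, key))
```

The second call models a modification of the PCM samples between decoder and encoder by an entity without access to the key, in which case only the samples are passed on.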

It should be noted that the decoder and the encoder used within a transcoder are typically provided by the developer of the respective audio codec system. As such, the functionality of the decoder and the encoder may be controlled by the developer, thereby ensuring a high quality of audio content and associated metadata. On the other hand, the PCM samples and/or metadata between the decoder and the encoder may be modified by an unauthorized entity, thereby presenting a risk that the quality of audio content and/or metadata is reduced. By providing a decoder which generates signature values and by providing an encoder which verifies signature values, it can be ensured that the unauthorized modification of PCM samples and/or metadata can be detected.

The encoder may be configured to use the decoder secure key as the encoder secure key. As such, it is ensured that the encoder can verify whether the received PCM samples and metadata correspond to the PCM samples and metadata provided by the corresponding decoder.

The transcoder may further comprise a PCM processing stage which is configured to modify the set of decoded PCM samples and/or the extracted metadata, thereby yielding a set of second PCM samples and associated second metadata. The set of second PCM samples may correspond to the set of decoded PCM samples or to the set of modified PCM samples. In a similar manner, the second metadata may correspond to the extracted metadata or to the modified extracted metadata. Furthermore, the PCM processing stage may be configured to pass the set of second PCM samples and associated second metadata to the encoder. Using the received signature value and the encoder secure key, the encoder may be configured to detect that the decoded PCM samples and/or the extracted metadata have been modified by the PCM processing stage. In other words, the encoder may be configured to detect that the second PCM samples (received by the encoder) and the second metadata (received by the encoder) do not correspond to the decoded PCM samples and the extracted metadata (provided by the decoder).

The transcoder may further comprise a re-signing unit which is configured to determine an updated signature value for the set of second PCM samples and associated second metadata, using a re-signing secure key. Furthermore, the re-signing unit may be configured to pass the updated signature value to the encoder. The re-signing secure key may be different from the decoder secure key. The encoder may be configured to use the re-signing secure key as the encoder secure key. As such, the encoder may be configured to detect that the PCM samples and/or associated metadata have been modified by an authorized PCM processing stage. In a similar manner to the cryptographic values described in the present document, the secure keys for the signature values may be selected from a plurality of pre-determined secure keys. By way of example, the decoder secure key may be a highly secure key, whereas the re-signing secure key may be a moderate secure key, thereby providing different levels of trust for the PCM samples and/or associated metadata received at the encoder of the transcoder.

The encoder of the transcoder may comprise a PCM processing stage configured to modify the set of received PCM samples and/or the received metadata. The set of received PCM samples may correspond to the set of decoded PCM samples or to the set of second PCM samples. In a similar manner, the received metadata may correspond to the extracted metadata or to the second metadata. The encoder may be configured to generate the outbound content frame and/or the outbound metadata frame based on the modified set of received PCM samples and/or modified received metadata, which have been modified by the PCM processing stage of the encoder. By providing the encoder of the transcoder with a PCM processing stage, it can be ensured that a chain of trust is maintained within the transcoder (as the PCM processing is performed within the encoder provided by a developer of the encoder).

As indicated above, the PCM-connected transcoder may comprise any of the features described in the present document. In particular, the decoder of the transcoder may be configured to identify an inbound block of metadata from the inbound metadata frame. As outlined in the present document, the inbound block of metadata may be associated with an inbound descriptor indicative of one or more properties of metadata comprised within the inbound block of metadata. The one or more properties may be any one or more of the properties described in the present document. The encoder of the transcoder may be configured to generate the outbound metadata frame from the inbound metadata frame at least based on the inbound descriptor.

In particular, the one or more properties may comprise a PCM processing parameter indicative of whether or not the metadata of the inbound block is to be discarded by the encoder, subject to a modification of the set of PCM samples and/or of the extracted metadata. In such cases, the encoder of the transcoder may be configured to include or not include the inbound block into the outbound metadata frame based on the value of the PCM processing parameter. In particular, the encoder may be configured to include the metadata of the inbound block into the outbound metadata frame, if the PCM processing parameter indicates that the metadata of the inbound block should not be discarded, even if the set of PCM samples and/or the extracted metadata has been modified. This may be useful, e.g. in situations where the metadata comprised within the inbound block is independent of the set of PCM samples (as is the case e.g. for auxiliary data or binary data).

The PCM processing stage of the transcoder may be configured to provide an indication of one or more PCM processing conditions to the encoder of the transcoder. The one or more PCM processing conditions may indicate how the set of PCM samples and/or how the extracted metadata has been processed by the PCM processing stage. By way of example, the one or more PCM processing conditions may comprise one or more of: conversion of a sampling rate of the set of PCM samples, mixing of the PCM samples with a system sound, modification of the extracted metadata, modification of a channel configuration of the set of PCM samples (in case of an audio signal), leveling of the loudness of the set of PCM samples. The encoder may then be configured to generate the outbound metadata frame from the inbound metadata frame also based on the one or more PCM processing conditions. In particular, the encoder may be configured to decide whether or not to include the inbound block into the outbound metadata frame, based on the value of the PCM processing parameter and based on the one or more PCM processing conditions. In other words, the PCM processing parameter may indicate how to process the inbound block, subject to one or more PCM processing conditions.
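The decision of the encoder can be sketched as a set comparison. The sketch assumes that the PCM processing parameter of a block is represented as the set of conditions under which the block must be discarded; this representation and the condition names are hypothetical and serve only to illustrate the logic.

```python
# Hypothetical names for the example PCM processing conditions listed above.
SAMPLE_RATE_CONVERSION = "sample_rate_conversion"
SYSTEM_SOUND_MIXING = "system_sound_mixing"
METADATA_MODIFICATION = "metadata_modification"
CHANNEL_CONFIG_CHANGE = "channel_config_change"
LOUDNESS_LEVELING = "loudness_leveling"


def keep_block(discard_on: frozenset, reported_conditions: set) -> bool:
    """Decide whether an inbound block goes into the outbound metadata frame.

    discard_on          -- conditions under which the block must be discarded
                           (assumed encoding of the PCM processing parameter)
    reported_conditions -- conditions reported by the PCM processing stage
    """
    return discard_on.isdisjoint(reported_conditions)


# Loudness metadata becomes stale once the PCM samples have been leveled.
loudness_block = frozenset({LOUDNESS_LEVELING, SYSTEM_SOUND_MIXING})
# Auxiliary data is independent of the PCM samples and is never discarded.
auxiliary_block = frozenset()

reported = {LOUDNESS_LEVELING}
print(keep_block(loudness_block, reported))
print(keep_block(auxiliary_block, reported))
```

With an empty set, a block survives any PCM processing, which corresponds to the case of auxiliary or binary data mentioned above.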

According to a further aspect, a method for transcoding an inbound bitstream comprising an inbound content frame and an associated inbound metadata frame into an outbound bitstream is described. The inbound bitstream may be indicative of a set of samples of a signal. The method may comprise converting the inbound content frame, at a decoder, into a set of decoded PCM samples of the signal. Furthermore, the method may comprise extracting metadata, at the decoder, from the inbound metadata frame. Furthermore, a signature value for the set of decoded PCM samples and the extracted metadata may be generated, using a decoder secure key. The set of decoded PCM samples, the extracted metadata and the generated signature value may be passed to a corresponding encoder. In addition, the method may comprise receiving a set of PCM samples and associated metadata, and receiving a signature value, at the encoder. The method may proceed by determining whether the received signature value is valid for the received set of PCM samples and associated metadata, using an encoder secure key. Subsequently, an outbound content frame of the outbound bitstream may be generated from the received set of PCM samples and an associated outbound metadata frame of the outbound bitstream may be generated from the received metadata, if the received signature is valid.

According to further aspects, the above mentioned decoder and encoder of a PCM-connected transcoder are described stand-alone. The decoder and/or the encoder may comprise any one or more of the decoder and/or encoder related features described in the present document, respectively. The decoder and/or the encoder may be used in a transcoder (as described above). Alternatively or in addition, the decoder and/or the encoder may be used stand-alone. As such, according to a further aspect, a decoder configured to decode an inbound bitstream comprising an inbound content frame and an associated inbound metadata frame is described. The inbound bitstream may be indicative of a set of samples of a signal. The decoder may be configured to convert the inbound content frame into a set of decoded PCM samples of the signal. Furthermore, the decoder may be configured to extract metadata from the inbound metadata frame. In addition, the decoder may be configured to generate a signature value for the set of decoded PCM samples and for the extracted metadata, using a decoder secure key. As such, the set of decoded PCM samples and the associated extracted metadata may be protected using a signature value. The signature value may be used by a receiving party of the set of decoded PCM samples and of the associated extracted metadata to verify whether the set of decoded PCM samples and/or the associated extracted metadata has been modified in an unauthorized manner. The receiving party may be an encoder which is configured to re-encode the set of decoded PCM samples and the associated extracted metadata into an outbound bitstream. Hence, the decoder may be configured to send the set of decoded PCM samples, the extracted metadata and the generated signature value to an encoder for re-encoding.

According to another aspect, an encoder configured to encode an outbound bitstream comprising an outbound content frame and an associated outbound metadata frame is described. The encoder may be configured to receive a set of PCM samples and associated metadata, and to receive a signature value for the set of PCM samples and associated metadata. The received set of PCM samples may correspond to (or may have been derived from) the above mentioned set of decoded PCM samples. In a similar manner, the received set of associated metadata may correspond to (or may have been derived from) the above mentioned set of extracted metadata. The received signature value may have been determined (e.g. at a decoder) using the above mentioned set of decoded PCM samples and the extracted metadata.

The encoder may be configured to verify whether the received signature value is valid for the received set of PCM samples and associated metadata, using an encoder secure key. Furthermore, the encoder may be configured to generate an outbound content frame of the outbound bitstream from the received set of PCM samples and to generate an associated outbound metadata frame of the outbound bitstream from the received metadata, if the received signature is valid.

According to another aspect, a method for decoding an inbound bitstream comprising an inbound content frame and an associated inbound metadata frame is described. The inbound bitstream may be indicative of a set of samples of a signal. The method may comprise converting the inbound content frame into a set of decoded PCM samples of the signal, and extracting metadata from the inbound metadata frame. Furthermore, the method may comprise generating a signature value for the set of decoded PCM samples and for the extracted metadata, using a decoder secure key. In addition, the method may comprise providing the set of decoded PCM samples, the extracted metadata and the generated signature value to an encoder for re-encoding.

According to a further aspect, a method for encoding an outbound bitstream comprising an outbound content frame and an associated outbound metadata frame is described. The method may comprise receiving a set of PCM samples and associated metadata, and receiving a signature value for the set of PCM samples and associated metadata. Furthermore, the method may comprise verifying whether the received signature value is valid for the received set of PCM samples and associated metadata, using an encoder secure key. The method may proceed by generating an outbound content frame of the outbound bitstream from the received set of PCM samples and by generating an associated outbound metadata frame of the outbound bitstream from the received metadata, if the received signature is valid.

According to a further aspect, a software program is described. The software program may be adapted for execution on a processor and for performing the method steps outlined in the present document when carried out on the processor.

According to another aspect, a storage medium is described. The storage medium may comprise a software program adapted for execution on a processor and for performing the method steps outlined in the present document when carried out on the processor.

According to a further aspect, a computer program product is described. The computer program product may comprise executable instructions for performing the method steps outlined in the present document when executed on a computer.

It should be noted that the methods and systems, including their preferred embodiments, as outlined in the present patent application may be used stand-alone or in combination with the other methods and systems disclosed in this document. Furthermore, all aspects of the methods and systems outlined in the present patent application may be arbitrarily combined. In particular, the features of the claims may be combined with one another in an arbitrary manner.

SHORT DESCRIPTION OF THE FIGURES

The invention is explained below in an exemplary manner with reference to the accompanying drawings, wherein

FIG. 1a shows a block diagram of an example audio content distribution chain comprising a transcoder;

FIG. 1b shows an example structure of a metadata frame;

FIGS. 2a and 2b show examples of a timestamp property used in a metadata frame;

FIGS. 3a and 3b show examples of a de-duplication property used in a metadata frame;

FIGS. 4a and 4b show examples of a duplication property used in a metadata frame; and

FIGS. 5a, 5b, 5c, 5d and FIG. 6, which includes FIGS. 6a and 6b, show example PCM-connected transcoders.

DETAILED DESCRIPTION

As outlined in the background section, audio content is typically associated with metadata and encoded into a joint bitstream comprising a sequence of encoded content frames and an associated sequence of metadata containers (also referred to as metadata frames). FIG. 1a shows a block diagram of an example distribution system 100 for audio content. The methods and systems described in the present document are outlined in the context of audio content. It should be noted, however, that the methods and systems are applicable to other types of content, such as video content. In more general terms, the methods and systems described in the present document are applicable to multi-media content, such as audio and/or video, wherein the multi-media content is associated with metadata.

The distribution system 100 comprises an encoder 101 which is configured to encode the audio content and provide an encoded bitstream 110 (also referred to as the first encoded bitstream 110 or the inbound encoded bitstream 110). The first encoded bitstream 110 typically comprises a sequence of encoded content frames 111, wherein an encoded content frame 111 may be associated with a corresponding metadata frame 112. The encoder 101 is configured to provide a first encoded bitstream 110 which is encoded in accordance with a first audio codec system. The first audio codec system may e.g. be one of: Dolby E, Dolby Digital Plus, Dolby Digital, Dolby True HD, Dolby Pulse, AAC and/or HE-AAC. The content frames 111 may represent or may comprise a pre-determined number of samples of the audio content, e.g. 1536, 1024 or 512 samples of the audio content.

The first encoded bitstream 110 is provided to a transcoder 103 via a transmission medium or via a storage medium 102. The transcoder 103 is configured to transcode or convert the first encoded bitstream 110 into a second encoded bitstream 120 (also referred to as an outbound bitstream 120), wherein the second encoded bitstream 120 is encoded in accordance with a second audio codec system. The second audio codec system may be different from the first audio codec system. On the other hand, the second audio codec system may be the same as the first audio codec system, but use a different configuration, e.g. a different bit-rate, a different frame-rate and/or a different channel configuration. The second audio codec system may e.g. be one of: Dolby E, Dolby Digital Plus, Dolby Digital, Dolby True HD, Dolby Pulse, AAC and/or HE-AAC. In a similar manner to the first encoded bitstream 110, the second encoded bitstream 120 comprises a sequence of content frames 121 and a corresponding sequence of metadata frames 122. The content frames 121 of the second encoded bitstream 120 may have a frame size which is different from the frame size of the content frames 111 of the first encoded bitstream 110. The second encoded bitstream 120 may be provided to a decoder 104 for rendering of the audio content.

The metadata frames 112, 122 may have a pre-determined structure. In other words, the metadata frames 112, 122 may follow a pre-determined syntax. By way of example, the metadata frames 112, 122 may follow the so-called evolution frame syntax illustrated in Table 1. The evolution frame syntax may e.g. be used in the context of standardized multi-media content codec systems such as the Digital Video Broadcast (DVB) system and/or a Moving Picture Experts Group (MPEG) codec system. It should be noted that the metadata frame syntax shown in Table 1 and the following tables is only an example. Variations or modifications of the syntax are possible. In particular, the syntax shown in the present document may be extended by additional fields, e.g. for providing additional functionalities.

TABLE 1

Syntax                                        No. of bits   Comments
evo_frame( ) {
  key_id = variable_bits (3);
  while ((id = payload_id) != END) {          5
    if (payload_id == 31) {
      payload_id += variable_bits (5);
    }
    payload_config( );
    payload_size = variable_bits (8);
    payload (payload_id, payload_size);
  }
  protection( );
}

The semantics of the parameters of the evolution frame shown in Table 1 may be as follows:

- key_id may be an identifier of the cryptographic key used for hashing (i.e. used for calculating the protection_bits of the protection( ) field).
- payload_id may be an identifier of the following application payload; a payload_id END=“0000b” may have the meaning that no further payload is contained in this evo_frame( );
- payload_size may indicate the number of bytes in the following payload field.

The evolution frame syntax specifies a metadata frame 112, 122 which may comprise a plurality of blocks of metadata, wherein a block of metadata is also referred to as a payload. As such, a metadata frame 112, 122 may comprise zero, one or more blocks of metadata, wherein each block of metadata is indicative of a particular type and/or a particular aspect of metadata. Examples of types of metadata are:

- descriptive metadata which describes particular aspects of the content frame 111 that the metadata frame 112 is associated with (e.g. tempo and/or harmonic information);
- unrelated metadata which comprises auxiliary data, which is not directly related to the content frame 111 (such as firmware upgrades for a target decoder of the encoded audio content);
- control metadata which may be used to control the rendering of one or more samples of the content frame 111 that the metadata frame 112 is associated with (e.g. loudness values for one or more samples of the content frame 111).

As such, the metadata frame 112 provides a flexible structure which can be expanded by additional blocks of metadata as needed, in order to describe additional characteristics of the encoded audio content or in order to transmit additional auxiliary data within the bitstream 110. In case no metadata is to be transmitted along with a content frame 111, the metadata frame 112 may comprise no block of metadata, which may be indicated in the syntax of the evolution frame of Table 1 by a payload_id which corresponds to a pre-determined “END” ID (identifier).

In the present document, it is proposed to add a descriptor to a block of metadata, wherein the descriptor describes one or more characteristics or properties of the metadata comprised within the associated block of metadata. This descriptor is referred to as “payload_config( )” in the syntax of the evolution frame shown in Table 1. The descriptor may be used by a transcoder to perform an efficient transcoding of the block of metadata, without the need to analyze the metadata comprised within the associated block of metadata. As a result of this, the complexity of the transcoding of metadata can be significantly reduced.

In other words, the present document describes methods of transcoding blocks of metadata (also referred to as payloads) within a metadata frame 112 (e.g. within the evolution frame shown in Table 1) from one coded bitstream to another bitstream. The transcoding operations may be guided by specific fields within each payload (e.g. the field “payload_config( )” of a block of metadata, as shown in Table 1). The transcoding operations may then be specified such that the individual payloads can be appropriately transcoded from one coded stream to another coded stream, without the need to extract or interpret the essence of the underlying metadata parameters of the block of metadata (i.e. without the need to extract or interpret the essence of the underlying payloads).

FIG. 1b shows an example structure of a metadata frame 130 (e.g. the metadata frame 112). The metadata frame 130 may comprise a frame header 131 which is indicative of generic information regarding the structure of the metadata frame 130 and the association of the metadata frame 130 with a content frame 111 of the coded bitstream 110. The frame header 131 may comprise some or all of the fields of the evolution frame of Table 1 which are not related to the payloads of the frame. Furthermore, the metadata frame 130 may comprise one or more blocks 140 of metadata (also referred to as metadata payload 140). A block 140 of metadata may comprise a block header 141, which may be indicative of the size of the block 140 of metadata (referred to as payload_size in Table 1). Furthermore, the block 140 of metadata may comprise a descriptor 142 (referred to as payload_config( ) in Table 1), wherein the descriptor 142 may be indicative of the type of metadata and/or of one or more properties of the metadata, which is comprised in the data field 143 (i.e. the payload( ) shown in Table 1) of the block 140 of metadata.

An example descriptor 142 of a block 140 of metadata for an evolution frame, i.e. an example “payload_config( )” field, is shown in Table 2. It can be seen that the descriptor 142 may comprise or may be indicative of one or more properties of the metadata comprised within the block 140. In the example of Table 2, the properties are:

- a timestamp parameter indicative of a sample of the audio content, to which the metadata of the block 140 is applicable. The timestamp may indicate a sample which is comprised within the content frame 111 that is associated with the metadata frame 112 of the block 140. Alternatively or in addition, the timestamp may be configured to take on sufficiently large values, to indicate a sample which is comprised within a content frame that is succeeding the content frame 111 which is associated with the metadata frame 112 of the block 140.
- a duration parameter indicative of the number of samples (starting from the sample indicated by the timestamp), for which the metadata of the block 140 is applicable.
- a transcoding flag (referred to as a “don't transcode” flag in Table 2) which provides an instruction to a transcoder on whether or not to transcode the block 140 of metadata. If the “don't transcode” flag is set, the transcoder may simply ignore or remove the block 140 of metadata when transcoding the inbound bitstream 110. This may be useful in case of metadata which is relevant only for the first codec system of the inbound bitstream 110, and does not make sense for any other codec system to which the bitstream 110 may be transcoded (as is the case e.g. for a cyclic redundancy check (CRC) which is generated over data comprised within the inbound bitstream 110. A CRC typically only makes sense, if the encoded data are not modified, so that there is no need to transcode the CRC). In more general terms, the transcoding flag may be used to identify metadata that is only useful during the decode process of the inbound bitstream within the transcoder (and therefore not required for the subsequent re-encode process for generating the outbound bitstream).
- a duplicate flag which provides an instruction to a transcoder on whether or not to duplicate the metadata comprised within the block 140, when the size of the content frame 111 prior and subsequent to transcoding differs.
- a de-duplicate flag which provides an instruction to a transcoder on whether or not to remove duplicates of the metadata comprised within the block 140, when the size of the content frame 111 prior and subsequent to transcoding differs.
- a priority parameter which provides an indication of the relative importance of the metadata comprised within the block 140. The transcoder may use the priority parameter to select one or more blocks 140 from a metadata frame 130, e.g. if the allowed bit-rate of the transcoded second bitstream 120 is reduced with respect to the bit-rate of the first bitstream 110.
- an association flag (referred to as the “now_or_never” flag in Table 2) which provides an indication to the transcoder on whether or not the metadata comprised within the block 140 is associated with the corresponding content frame 111. As such, if the “now_or_never” flag is set, the transcoder is aware of the fact that the metadata comprised within the block 140 should either be transcoded immediately or should be dropped (as the “now_or_never” flag indicates that the decoder cannot use the metadata if the metadata is delayed).

TABLE 2

Syntax                                No. of bits   Comments
payload_config( ) {
  timestamp_present;                  1
  if (timestamp_present) {
    timestamp = variable_bits (11);
  }
  duration_present;                   1
  if (duration_present) {
    duration = variable_bits (11);
  }
  dont_transcode;                     1
  if (!dont_transcode) {
    duplicate;                        1
    deduplicate;                      1
    priority;                         5
    now_or_never;                     1
    tight_coupling;                   2
  }
}

In other words, the semantics of the property parameters of the descriptor 142 shown in Table 2 may be as follows:

- a timestamp parameter indicating the offset in samples from the beginning of the content frame 111 to which the payload 143 in question belongs;
- a duration parameter indicating the time in samples for which the payload 143 in question remains valid;
- a dont_transcode flag that signals whether the payload 143 in question must be discarded when transcoding (flag=1) or whether transcoding can occur (flag=0);
- a duplicate flag that, when set to 1, signals that the payload 143 in question needs to be repeated during transcoding so that it appears in the transcoded blocks 140 between timestamp and timestamp+duration. The duplicate flag may be set, e.g. for loudness data to indicate that frames have the same dialnorm. In more general terms, the duplicate flag may be set for metadata that do not have a notion of time. The duplicate flag is typically not set for data that supports the concept of time by itself, like e.g. the bitstream of a codec. In other words, metadata that is internally timed may not be provided with a duplicate and/or de-duplicate flag which is set, wherein the term “internally timed” means that only the exact sequence of blocks of metadata is meaningful, i.e. a repetition or de-duplication would invalidate the metadata. An example of metadata which is internally timed is a different bitstream (different from the content comprised within the content frames) which is embedded into a sequence of metadata blocks of a sequence of metadata frames. The payload of such a bitstream should never be repeated or de-duplicated. Otherwise, the bitstream would be repeated in parts or partially chopped. Another example of internally timed data is binary data, like an executable program. If such binary data is transmitted in multiple metadata blocks of multiple metadata frames, then the duplication or de-duplication of metadata blocks would invalidate the meaning of the binary data.
- a de-duplicate flag which, when set to 1, signals that during transcoding every block of metadata of a particular id beyond the first within the same outbound metadata frame may be deleted. The de-duplicate flag may be set e.g. for loudness data like dialnorm that does not need to be present multiple times per outbound metadata frame 122.
- a “now_or_never” flag that indicates that a payload must not be delayed while transcoding.
- a PCM processing parameter, referred to as a “tight_coupling” parameter in Table 2. The PCM processing parameter may e.g. be used in the context of a PCM-connected transcoder as described below, in order to inform the PCM-connected transcoder on how to handle the metadata of a particular metadata frame which is associated with a particular content frame, in case of a modification of the samples of the signal comprised within the content frame. The function of the PCM processing parameter will be described in further detail below, when describing the functions of a PCM-connected transcoder.
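
The descriptor semantics listed above can be modeled as a small record. The following Python sketch is illustrative only: the class names PayloadConfig and MetadataBlock and the helper blocks_to_transcode are assumptions that do not appear in Table 2, and the sketch holds already-parsed values rather than implementing the bit-level syntax.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PayloadConfig:
    """Parsed payload_config( ) fields; field names follow Table 2."""
    timestamp: Optional[int] = None   # set only if timestamp_present == 1
    duration: Optional[int] = None    # set only if duration_present == 1
    dont_transcode: bool = False
    # The fields below are only carried in the syntax when dont_transcode == 0.
    duplicate: bool = False
    deduplicate: bool = False
    priority: int = 0                 # 5-bit field
    now_or_never: bool = False
    tight_coupling: int = 0           # 2-bit field


@dataclass
class MetadataBlock:
    """Hypothetical container for one block 140 (descriptor plus data field)."""
    payload_id: int
    config: PayloadConfig
    payload_bytes: bytes = b""


def blocks_to_transcode(blocks: List[MetadataBlock]) -> List[MetadataBlock]:
    """Keep only the blocks whose descriptor permits transcoding.

    The decision uses the descriptor alone; payload_bytes is never inspected.
    """
    return [b for b in blocks if not b.config.dont_transcode]
```

In this sketch a block carrying, e.g., a CRC over the inbound bitstream would have dont_transcode set and would be removed by blocks_to_transcode without its data field being read.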

TABLE 3

Syntax                              No. of bits   Comments
payload (id, size) {
  for (i = 0; i < size; i++) {
    payload_bytes[i];               8
  }
}

Table 3 shows the syntax of an example data field 143 of a block 140 of metadata.

As outlined above, the bitstream syntax for carrying metadata (i.e. the metadata frame 130 comprising a block 140 of metadata) may define generic metadata properties (e.g. comprised in the descriptor 142, i.e. in the payload_config( ) field shown in Table 2). These properties enable a simple copying of the metadata from one inbound (i.e. first) bitstream 110 to an outbound (i.e. second) bitstream 120, even if the first codec (used for encoding the inbound bitstream 110) and the second codec (used for encoding the outbound bitstream 120) use different framing. The way that the copying of the metadata is done is guided by the properties comprised within the descriptor 142. The only thing that might need to be changed during the transcoding process may be the properties themselves. However, the modification of the properties comprised within the descriptor 142 does not require knowledge about the actual meaning of the metadata comprised within the data field 143 of the block 140.

In the following, the example properties shown in Table 2 are described in more detail. In particular, it is described how the transcoder 103 can make use of one or more of the properties indicated by the descriptor 142 for performing an efficient transcoding of the metadata comprised within a block 140 of metadata.

FIGS. 2a and 2b illustrate the use of the timestamp parameter comprised within the descriptor 142 of a block 140 of metadata. In FIG. 2a it is illustrated how the timestamp parameter 201 may be updated by a transcoder 103, when transcoding metadata from the first bitstream 110 to the second bitstream 120. In the illustrated example, the timestamp parameter 201 indicates the position of a particular sample 202 relative to the end of the associated content frame 111 (i.e. relative to the most recent sample). As such, the timestamp parameter 201 is indicative of a “delay” of the sample 202 with respect to the most recent sample comprised within the content frame 111. In the illustrated example of FIG. 2a, the content frames 121 of the second bitstream 120 have a different, in particular a greater, size than the content frames 111 of the first bitstream 110. As a result of this, the particular sample 202 may be located at a different relative position within the content frame 121 of the second bitstream 120 compared to the relative position within the content frame 111 of the first bitstream 110. In particular, the particular sample 202 may exhibit a different “delay” with respect to the most recent sample comprised within the outbound content frame 121, than with respect to the most recent sample comprised within the inbound content frame 111. As a result of this, the timestamp parameter 201 comprised within the metadata frame 112 of the first bitstream 110 may need to be modified, when inserted into the metadata frame 122 of the second bitstream 120, thereby yielding the transcoded timestamp parameter 203.
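
The timestamp update of FIG. 2a may be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the method of the present document: it assumes that both bitstreams start at sample 0, that the frame grids are aligned without codec delay, and that frames are numbered from 0. The function name retime_timestamp and its parameters are hypothetical.

```python
from typing import Tuple


def retime_timestamp(delay_in: int, frame_index_in: int,
                     frame_size_in: int, frame_size_out: int) -> Tuple[int, int]:
    """Map a timestamp from the inbound to the outbound frame grid.

    delay_in is the distance (in samples) of the addressed sample from the
    most recent sample of inbound frame frame_index_in.
    Returns (outbound frame index, delay relative to the most recent sample
    of that outbound frame).
    """
    if frame_size_in <= 0 or frame_size_out <= 0:
        raise ValueError("frame sizes must be positive")
    # Absolute index of the most recent sample of the inbound frame.
    last_in = (frame_index_in + 1) * frame_size_in - 1
    # Absolute index of the sample the metadata applies to.
    sample = last_in - delay_in
    # Outbound frame that contains this sample (assumed aligned frame grids).
    frame_index_out = sample // frame_size_out
    last_out = (frame_index_out + 1) * frame_size_out - 1
    return frame_index_out, last_out - sample
```

For example, with 1024-sample inbound frames and 1536-sample outbound frames, a delay of 100 samples in inbound frame 1 addresses absolute sample 1947, which lies in outbound frame 1 at a delay of 1124 samples. Only the descriptor value changes; the data field is copied unmodified.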

FIG. 2b illustrates the possibility of moving metadata blocks 140 within the bitstream 110, 120. This may be useful in order to smoothen the bit-rate of the bitstream 120 subsequent to transcoding. By way of example, the metadata of a particular block 140 in the metadata frame 112 may be associated with the particular sample 202 in the content frame 111 (indicated by the timestamp parameter 211). As outlined above, the location of the particular sample 202 may be indicated relative to the last, i.e. the most recent, sample of the inbound content frame 111. If it is not essential that the metadata of the particular block 140 arrives directly subsequent to the content frame 121 comprising the sample 202 (as may be indicated by the association flag, referred to as the “now_or_never” flag in Table 2), the particular block 140 may be moved by the transcoder to a metadata frame 222 of a content frame 221 which is subsequent to the content frame 121 comprising the sample 202. The transcoder 103 may update the timestamp parameter 213 such that it points to the correct sample 202.

In particular, the timestamp parameter 213 may indicate the location of the sample 202 relative to the last, i.e. the most recent, sample of the outbound content frame 221 that the outbound metadata frame 222 which comprises the timestamp parameter 213 is associated with. For this purpose, the timestamp parameter 213 may take on values which exceed the number of samples comprised within a content frame 221. In a similar manner, the timestamp parameter 213 may be configured to take on negative values. Such negative values could be used to indicate a sample 202 which is comprised in a future content frame, i.e. in a content frame which is subsequent to the content frame 221 associated with the metadata frame 222 comprising the timestamp parameter 213. By doing this, metadata may be transmitted prior to the one or more samples that it is associated with (e.g. that it is to be applied to).

As such, the timestamp parameter 211 (possibly in combination with the association flag) enables a transcoder 103 to transmit the metadata associated with a timestamp 211 in a subsequent or preceding metadata frame 222 and adjust the timestamp 213 such that it refers to the same PCM sample 202 (even though after transcoding, the sample 202 is not comprised in the content frame 221 which is associated with the metadata frame 222 which comprises the particular block 140). As a result of this, the transcoder 103 is provided with some flexibility to smoothen the bit-rate of the second bitstream 120.
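
The relocation of a block to a different metadata frame may be sketched as follows. The sketch assumes frames numbered from 0 on a grid starting at sample 0, treats the timestamp as a signed delay relative to the most recent sample of the carrying frame, and does not model how negative or large values would be encoded in the bitstream. The function name timestamp_for_carrier_frame is hypothetical.

```python
def timestamp_for_carrier_frame(sample: int, carrier_frame_index: int,
                                frame_size: int, now_or_never: bool = False) -> int:
    """Timestamp pointing at absolute sample index `sample` when the block is
    carried in the metadata frame of outbound frame `carrier_frame_index`.

    The result exceeds frame_size - 1 if the block is sent after the frame
    that contains the sample, and is negative if it is sent ahead of it.
    """
    home_frame_index = sample // frame_size
    if now_or_never and carrier_frame_index > home_frame_index:
        # Assumption of this sketch: a set now_or_never flag forbids delaying.
        raise ValueError("block must not be delayed")
    last_sample_of_carrier = (carrier_frame_index + 1) * frame_size - 1
    return last_sample_of_carrier - sample
```

With 1536-sample frames and sample 1947 (which lies in frame 1), the timestamp is 1124 when the block stays in frame 1, 2660 when it is delayed to frame 2, and -412 when it is sent early in frame 0. In all three cases the timestamp refers to the same PCM sample.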

It should be noted that, in a similar manner to the transcoder 103, the encoder 101 may be configured to include metadata for a sample into a subsequent metadata frame. As such, the encoder 101 may be configured to generate a timestamp 213 which points to a sample 202 which is comprised in a content frame 121 that is not the content frame that the metadata frame comprising the timestamp 213 is associated with.

FIGS. 3a and 3b illustrate possible use cases of the de-duplicate flag indicated by the descriptor 142 of a block 140 of metadata. In the illustrated cases, the content frames 121 of the second bitstream 120 represent a higher number of samples (i.e. have a larger frame size) than the content frames 111 of the first bitstream 110. If the frame sizes differ, situations may occur where a single content frame 121 of the second bitstream 120 comprises samples from more than one content frame 111 of the first bitstream 110. In such cases, blocks 140 of metadata may be available from the more than one metadata frame 112 associated with the more than one content frame 111 of the first bitstream 110. The transcoder 103 has to decide which of the blocks 140 of metadata are to be included in the single metadata frame 122 of the single content frame 121 of the second bitstream 120. The de-duplicate flag of a particular block 140 may indicate to the transcoder 103 that the particular block 140 does not need to be inserted into a metadata frame 122 of the second bitstream 120, if metadata blocks 140 from a plurality of metadata frames 112 of the first bitstream 110 are to be merged. As such, the transcoder 103 may be configured to drop or ignore the metadata blocks 140 of additional metadata frames 112, for which the de-duplicate flag is set.

This is illustrated in FIG. 3a, where the outbound content frame 121 (i.e. the content frame 121 of the outbound bitstream 120) comprises the samples of the inbound content frames 111 and 311 (i.e. of the content frames 111, 311 of the inbound bitstream 110). The transcoder 103 has to decide which of the blocks 140 of the inbound metadata frames 112, 312 (i.e. of the metadata frames 112, 312 of the inbound bitstream 110) are to be included into the outbound metadata frame 122 (i.e. of the metadata frame 122 of the outbound bitstream 120) associated with the outbound content frame 121. In the illustrated example of FIG. 3a, it is assumed that the de-duplicate flag is set at least for the one or more blocks 140 of the inbound metadata frame 312. As such, the transcoder 103 may be configured to drop the blocks 140 of the inbound metadata frame 312.

It should be noted that the de-duplicate flag of the one or more blocks 140 of the inbound metadata frame 112 may also be set. The transcoder 103 may be configured to only drop the blocks 140 of a second (or further) metadata frame 312 used to build the outbound metadata frame 122. In other words, the transcoder 103 may be configured to consider the de-duplicate flag only if more than one inbound metadata frame 112 is to be considered for generating an outbound metadata frame 122. As such, the de-duplicate flag may be used to prevent “duplicates” of a particular type of metadata block 140, while still ensuring that at least one metadata block 140 of the particular type is included.

FIG. 3b illustrates an example case, where the de-duplicate flag is not set. In this case, the transcoder 103 may be configured to consider the blocks 140 of the plurality of inbound metadata frames 112 and 312 for building the outbound metadata frame 122. In particular, the transcoder 103 may be configured to insert a block 140 from the inbound metadata frame 312 into the outbound metadata frame 122, if the de-duplicate flag is not set (even in situations where the outbound metadata frame 122 is generated from a plurality of inbound metadata frames 112, 312).

The de-duplicate flag may e.g. be used to identify metadata blocks 140 which are inserted into a plurality of succeeding metadata frames 112, 312 (e.g. into every metadata frame 112, 312 of a bitstream 110). As such, the de-duplicate flag enables a transcoder 103 to easily identify metadata blocks 140 which may be discarded (without the need of analyzing the metadata stored in the data field 143 of the metadata block 140). As a result, the computational complexity for transcoding metadata is reduced. On the other hand, a de-duplicate flag which is not set indicates that a corresponding block 140 of metadata should not be dropped. This may be used for auxiliary data, in order to ensure that the auxiliary data is not dropped, even if a plurality of inbound metadata frames 112, 312 are transcoded into a single outbound metadata frame 122.
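
The merging behaviour of FIGS. 3a and 3b may be sketched as follows. The Block record and the function merge_inbound_frames are hypothetical names introduced for illustration; the sketch keeps the first block of each payload id and drops later blocks of the same id only if their de-duplicate flag is set.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Block:
    """Hypothetical reduced view of a block 140: id, flag and opaque data."""
    payload_id: int
    deduplicate: bool
    payload_bytes: bytes = b""


def merge_inbound_frames(inbound_frames: Iterable[List[Block]]) -> List[Block]:
    """Build one outbound metadata frame from several inbound metadata frames."""
    outbound: List[Block] = []
    seen_ids: Set[int] = set()
    for frame in inbound_frames:
        for block in frame:
            # Drop a flagged block if a block of the same id is already present.
            if block.deduplicate and block.payload_id in seen_ids:
                continue
            outbound.append(block)
            seen_ids.add(block.payload_id)
    return outbound
```

With two inbound frames that each carry a flagged loudness block (id 1) and an unflagged auxiliary block (id 2), the merged frame holds one loudness block and both auxiliary blocks. The data field is copied or dropped as a whole and is never parsed.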

FIGS. 4a and 4b illustrate an example usage of the duplicate flag indicated in the descriptor 142 of a block 140 of metadata. In the illustrated case, the inbound content frame 111 comprises a higher number of samples (i.e. has a larger frame size) than the outbound content frame 121. If the frame sizes differ, situations may occur where the samples of a single inbound content frame 111 are comprised within more than one outbound content frame 121, 321. As a consequence, the transcoder 103 receives a single inbound metadata frame 112 and has to decide in which one of the plurality of outbound metadata frames 122, 322 to place a particular block 140 of metadata. The duplicate flag may be used to indicate to the transcoder 103 whether or not to duplicate a particular block 140 from the inbound metadata frame 112. By setting the duplicate flag, it may be indicated that the metadata comprised within the block 140 should be comprised within every outbound metadata frame 122, 322, as is shown in FIG. 4a. On the other hand, an unset duplicate flag indicates that the metadata block 140 should only be transmitted once. As such, the transcoder 103 inserts the block 140 from inbound metadata frame 112 only into a single one of the plurality of outbound metadata frames 122, 322 (as illustrated in FIG. 4b).
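
The splitting behaviour of FIGS. 4a and 4b may be sketched as follows. The names Block and split_inbound_frame are hypothetical. As simplifying assumptions, the sketch treats a flagged block as valid for all outbound frames covered by the inbound frame (i.e. it ignores the timestamp and duration limits) and places an unflagged block into the first outbound frame, which is only one of the possible choices.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """Hypothetical reduced view of a block 140: id and duplicate flag."""
    payload_id: int
    duplicate: bool


def split_inbound_frame(blocks: List[Block],
                        num_outbound_frames: int) -> List[List[Block]]:
    """Distribute the blocks of one inbound metadata frame over several
    outbound metadata frames."""
    if num_outbound_frames < 1:
        raise ValueError("at least one outbound frame is required")
    outbound: List[List[Block]] = [[] for _ in range(num_outbound_frames)]
    for block in blocks:
        if block.duplicate:
            # FIG. 4a: repeat the block in every outbound metadata frame.
            for frame in outbound:
                frame.append(block)
        else:
            # FIG. 4b: transmit the block exactly once (first frame assumed).
            outbound[0].append(block)
    return outbound
```

A transcoder that also moves blocks for bit-rate smoothing could place the unflagged block into any one of the outbound frames, provided that the timestamp is updated accordingly.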

As outlined above, the descriptor 142 of a block 140 of metadata may be indicative of an association flag (referred to as the “now_or_never” flag in Table 2). The association flag may indicate that the metadata comprised within the block 140 may be delayed without impacting the content comprised in the associated content frame. As such, the syntax of the descriptor 142 may enable a transcoder 103 to delay metadata by an arbitrary amount of time, if this is one property of the metadata. This may be indicated by setting the flag now_or_never to 0. The association flag enables the transcoder 103 to transmit the metadata which is comprised within the block 140 e.g. when the underlying audio codec can “afford” the transmission of the metadata, e.g. when the content frames comprise silence. One example of metadata which may be delayed is auxiliary data or binary data, like a firmware upgrade, which does not need to be transmitted along with a particular content frame 121.
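
One possible way of exploiting the association flag is sketched below. The per-frame byte budget, the FIFO queue of delayed blocks and the names Block and schedule_frame are assumptions made for illustration; the present document does not prescribe a scheduling policy. The timestamp update of delayed blocks is not modeled.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class Block:
    """Hypothetical reduced view of a block 140."""
    payload_id: int
    size: int            # payload_size in bytes
    now_or_never: bool


def schedule_frame(new_blocks: List[Block], pending: Deque[Block],
                   budget: int) -> List[Block]:
    """Choose the blocks sent with the current outbound frame.

    Blocks with now_or_never set are sent now or dropped. Other blocks are
    queued in `pending` (modified in place) and sent when the budget allows.
    """
    sent: List[Block] = []
    remaining = budget
    for block in new_blocks:
        if block.now_or_never:
            if block.size <= remaining:
                sent.append(block)
                remaining -= block.size
            # Otherwise the block is dropped: it must not be delayed.
        else:
            pending.append(block)
    # Delayable blocks leave the queue in arrival order while they fit.
    while pending and pending[0].size <= remaining:
        block = pending.popleft()
        sent.append(block)
        remaining -= block.size
    return sent
```

In this sketch a large firmware block with now_or_never equal to 0 waits in the queue until a frame with spare capacity arrives (e.g. a frame comprising silence), whereas a time-critical block that does not fit is discarded.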

As described in the context of Table 2, the descriptor 142 of a block 140 of metadata may be indicative of or may comprise a priority property or a priority parameter. The priority parameter may indicate a relative importance of the metadata of a particular block 140 (e.g. relative to the importance of other blocks 140). A transcoder 103 can decide to only transcode a certain number of metadata blocks 140 and to discard all other metadata blocks in the metadata frame 112. This may e.g. be required when transcoding from a higher bit-rate inbound bitstream 110 to a lower bit-rate outbound bitstream 120. The priority parameter may enable the transcoder 103 to select those blocks 140 of an inbound metadata frame 112 having the relatively highest priorities and to discard (or delay) those blocks 140 having relatively lower priorities.

Applications and/or encoders 101 may provide multiple sets of metadata in the same metadata frame 112, each with a different priority. The multiple sets of metadata may be associated with different qualities of metadata. The priority of higher quality metadata may be lower than the priority of lower quality metadata. As such, the transcoder 103 may be configured to degrade the quality of the metadata by considering the priority parameter. By way of example, if priorities are set in a way such that scalability is possible, i.e. every metadata set can be applied if all metadata sets of the same application of a higher priority are transmitted, then a transcoder can gracefully degrade the quality of the metadata without having to know about the meaning of the metadata. In particular, the multiple sets of metadata may comprise incremental metadata, i.e. each set of metadata may add some quality to the set of metadata with the next highest priority. The highest quality of metadata may then be provided by combining all sets of metadata (from the highest priority down to the lowest priority). As such, an inbound metadata frame 112 may comprise a plurality of blocks 140 of incremental metadata, wherein the block 140 of metadata with the highest priority comprises a version of the metadata with minimum acceptable quality and wherein the blocks 140 with successively lower priority comprise incremental versions of metadata which allow the quality of the metadata to be increased incrementally. As such, the transcoder 103 may decide on the quality of metadata which is included into the second bitstream 120 by considering the priority parameters of the plurality of blocks 140 of incremental metadata.
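
The priority-based selection may be sketched as follows. As assumptions of this sketch, a numerically larger priority value denotes a more important block (the mapping of the 5-bit field to importance is not specified here), the available capacity is expressed as a byte budget, and the selection stops at the first block that does not fit, so that an incremental set is never kept without the higher-priority sets it builds on. The names Block and select_by_priority are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """Hypothetical reduced view of a block 140."""
    payload_id: int
    priority: int   # assumption: larger value means more important
    size: int       # payload_size in bytes


def select_by_priority(blocks: List[Block], budget: int) -> List[Block]:
    """Keep the highest-priority blocks that fit into `budget` bytes."""
    kept: List[Block] = []
    remaining = budget
    # sorted() is stable, so blocks of equal priority keep their order.
    for block in sorted(blocks, key=lambda b: b.priority, reverse=True):
        if block.size > remaining:
            break  # stop here to preserve the scalability property
        kept.append(block)
        remaining -= block.size
    return kept
```

With a base set of 40 bytes (priority 3) and two refinements of 60 and 30 bytes (priorities 2 and 1), a budget of 100 bytes keeps the base set and the first refinement, while a budget of 80 bytes keeps only the base set. The transcoder thus degrades the metadata quality without interpreting the data fields.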

As indicated in the example syntax of a metadata frame 112 shown in Table 1, the metadata frame 130 may comprise a protection field. The protection field may be used to enable the decoder 104 to verify whether the content of the metadata frame 130 and/or the content of the associated content frame has been modified and may therefore be invalid. In other words, the protection field may allow a decoder 104 to verify whether the metadata comprised within a metadata frame 130 and/or within an associated content frame is trustworthy or not. Table 4 shows an example syntax of a protection field of a metadata frame 130. The protection field may be comprised within the header 131 of the metadata frame 130.

TABLE 4

Syntax                                  No. of bits   Comments
protection( ) {
  protection_config_frame;              2
  protection_config_history;            2
  switch (protection_config_frame) {
    case 0:
      protection_bits_frame;            0
      break;
    case 1:
      protection_bits_frame;            8
      break;
    case 2:
      protection_bits_frame;            32
      break;
    case 3:
      protection_bits_frame;            128
      break;
  }
  switch (protection_config_history) {
    case 0:
      protection_bits_history;          0
      break;
    case 1:
      protection_bits_history;          8
      break;
    case 2:
      protection_bits_history;          32
      break;
    case 3:
      protection_bits_history;          128
      break;
  }
}

The semantics of the protection field may be as follows:

- protection_bits_frame may comprise the truncated protection payload of the current frame (comprising the content frame and/or the associated metadata frame).
- protection_bits_history may comprise the truncated protection payload of the current frame and of the frame(s) before the current frame (comprising the content frame and/or the associated metadata frame). An example scheme for securing a sequence of frames is described in WO2011/015369, the content of which is incorporated by reference.

As such, the protection field may comprise one or more cryptographic values. One of the cryptographic values may be generated based on the metadata comprised within a current metadata frame (comprising the protection field) and/or based on the content frame associated with the current metadata frame. As such, it may be ensured that an isolated metadata frame and/or the associated content frame are not modified. Another one of the cryptographic values may be generated based on the metadata comprised within the current metadata frame and within one or more preceding metadata frames (as well as on the respective associated content frames). As such, it may be ensured that sequences of content frames and/or metadata frames are not modified.

A cryptographic value may be determined at an encoder 101 by applying a one-way function to a group of one or more metadata frames 112, 312 and/or the associated content frames 111, 311. In particular, a cryptographic value may be generated using a key value and a cryptographic hash function (the so-called one-way function). In particular, the cryptographic value may be generated by calculating an HMAC-MD5 (hash message authentication code) value for the data comprised within one or more metadata frames 112, 312 and for the data comprised within the one or more associated content frames 111, 311. Furthermore, the generation of the cryptographic value may comprise truncating the HMAC-MD5 value, e.g. truncating to 16, 24, 32, 48, 64 or 128 bits. The truncation may be beneficial in view of reducing the required overhead for the cryptographic value in the encoded bitstream 110 comprising the metadata frames 112, 312. It should be noted that other hash functions, such as SHA-1 or SHA-256, may be used instead of MD5. Furthermore, it should be noted that the encoder 101 may be configured to transmit zero bits of a cryptographic value, i.e. to transmit no cryptographic value, e.g. in situations where no protection of the metadata is required.

In more detail, the cryptographic value for one or more content frames 111, 311 and one or more metadata frames 112, 312 may be determined by using a cryptographic hash function H(.) and a “secret” key K (also referred to as security key), which is typically padded to the right with extra zeros to the block size of the hash function H(.), to determine a hash message authentication code (HMAC) of the one or more content frames 111, 311 and of the one or more metadata frames 112, 312. Let the ∥ sign denote a concatenation and the ⊕ sign denote an exclusive or, and let the outer padding opad=0x5c5c5c . . . 5c5c and the inner padding ipad=0x363636 . . . 3636 be constants of the length of the block size of the hash function H(.). Then the HMAC value of the one or more content frames 111, 311 and of the one or more metadata frames 112, 312 may be written as

HMAC(m)=H((K⊕opad)∥H((K⊕ipad)∥m)),

where m is the combined bit sequence of the one or more content frames 111, 311 and of the one or more metadata frames 112, 312. The block size used with the MD5, SHA-1 or SHA-256 hash functions is typically 512 bits. The size of the output of the HMAC operation is the same as that of the underlying hash function, i.e. 128 bits in case of MD5 or 160 bits in case of SHA-1.
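By way of illustration, the HMAC formula above may be sketched in Python as follows. This is a minimal sketch and not a normative implementation: the key, the message bytes, the helper names manual_hmac_md5 and truncate, and the choice of keeping the leading bits when truncating are assumptions made for the example. The hand-built value is compared against the hmac module of the Python standard library.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # block size of MD5 in bytes (512 bits)


def manual_hmac_md5(key: bytes, message: bytes) -> bytes:
    """Compute H((K xor opad) || H((K xor ipad) || m)) with H = MD5."""
    if len(key) > BLOCK_SIZE:
        # Standard HMAC rule: keys longer than one block are hashed first.
        key = hashlib.md5(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")  # pad to the right with zeros
    inner_key = bytes(b ^ 0x36 for b in key)  # K xor ipad
    outer_key = bytes(b ^ 0x5C for b in key)  # K xor opad
    inner = hashlib.md5(inner_key + message).digest()
    return hashlib.md5(outer_key + inner).digest()


def truncate(value: bytes, num_bits: int) -> bytes:
    """Keep the leading num_bits bits (assumed to be a multiple of 8)."""
    return value[: num_bits // 8]


# Hypothetical key and frame data, for illustration only.
key = b"hypothetical-secret-key"
message = b"content frame bytes" + b"metadata frame bytes"

full = manual_hmac_md5(key, message)
reference = hmac.new(key, message, hashlib.md5).digest()
protection_bits = truncate(full, 32)
```

In this sketch, full and reference hold the same 128-bit value, and protection_bits holds a 32-bit truncation of it.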

As such, the protection field may comprise at least two cryptographic values:

-   -   a frame cryptographic value (referred to as “protection_bits_frame” in Table 4) which is indicative of the authenticity of an individual content frame 111 and its associated metadata frame 112. The frame cryptographic value may be used to identify whether the data of the individual content frame 111 and its associated metadata frame 112 has been changed. The frame cryptographic value may be determined using a message m which comprises the bit sequence of the individual content frame 111 and of its associated metadata frame 112 (or of the payload comprised within the individual content frame 111 and its associated metadata frame 112).
    -   a history cryptographic value (referred to as “protection_bits_history” in Table 4) which is indicative of the authenticity of a sequence of at least two content frames 111, 311 and their associated at least two metadata frames 112, 312. The history cryptographic value may be used to identify whether the sequence of the at least two content frames 111, 311 and their associated metadata frames 112, 312 has been changed. The history cryptographic value may be determined using a message m which comprises the bit sequence of the at least two content frames 111, 311 and their associated at least two metadata frames 112, 312 (or of the payload comprised therein).
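The difference between the two values may be sketched as follows in Python. The sketch assumes that the message m is formed by simple concatenation of content frame bytes and metadata frame bytes in frame order, that HMAC-MD5 truncated to 64 bits is used, and that the key and frame bytes are placeholders; none of these details are prescribed by the present document.

```python
import hashlib
import hmac


def protection_value(key: bytes, frames, num_bits: int = 64) -> bytes:
    """Truncated HMAC-MD5 over a list of (content, metadata) byte pairs."""
    message = b"".join(content + metadata for content, metadata in frames)
    return hmac.new(key, message, hashlib.md5).digest()[: num_bits // 8]


key = b"hypothetical-secret-key"
previous_frame = (b"previous content frame", b"previous metadata frame")
current_frame = (b"current content frame", b"current metadata frame")

# Frame value: covers only the current content and metadata frame.
protection_bits_frame = protection_value(key, [current_frame])
# History value: covers the preceding frame(s) and the current frame.
protection_bits_history = protection_value(key, [previous_frame, current_frame])

# Tampering with the preceding frame leaves the frame value of the
# current frame intact but changes the history value.
tampered_previous = (b"modified content frame", b"previous metadata frame")
frame_after_tampering = protection_value(key, [current_frame])
history_after_tampering = protection_value(key, [tampered_previous, current_frame])
```

The frame value therefore detects changes to an isolated frame, whereas the history value additionally detects changes to the preceding frame(s) of the sequence, including their removal or replacement.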

As outlined above, the cryptographic values are determined using a secure key K, which is typically known only to the encoder 101 and the decoder 104. In the present document, it is proposed to enable multiple levels of trust by allowing the use of different secure keys K providing different levels of trust. By way of example, at least two levels of trustworthy keys may be provided:

-   -   a highly secure key K₁, which may not be disclosed to any parties outside of the entity which provides the components 101, 103, 104 along a distribution chain 100. Such an entity may be a provider of the codec systems used along the distribution chain 100 (e.g. Dolby Laboratories). In particular, such an entity may be the provider of the encoders and the decoders used along the distribution chain 100. By keeping the highly secure key undisclosed, it can be ensured that a decoder 104 which renders the audio signal comprised within the received bitstream 120 can be certain that the metadata comprised within the metadata frames 122, 322 of the received bitstream 120 is authentic and has not been modified in an unauthorized manner along the distribution chain 100.
    -   a moderate secure key K₂, which may be disclosed to other parties, e.g. parties operating some of the components 101, 103, 104 along the distribution chain 100 (e.g. licensees of the provider of the codec systems). If the decoder 104 receives a bitstream 120 which has been protected using the moderate secure key K₂, the decoder 104 knows that the bitstream 120 comprises metadata (in the metadata frames 122, 322) which has been handled in accordance with some policies of the operator of the distribution chain 100, which may be different from the policies of the provider of the codec systems (holding the highly secure key K₁).

An indication of the secure key K used by the encoder 101 may be provided within a metadata frame 130 (e.g. within the header 131 of the metadata frame 130). This is illustrated in Table 1 which shows the key_id parameter. The key_id parameter may comprise an index to a pre-determined number of secure keys, thereby allowing the decoder 104 to determine the secure key K which was used to determine the one or more cryptographic values (wherein the one or more cryptographic values may be comprised in the protection( ) field of the metadata frame 130, as shown in Table 4). The decoder 104 may then use the identified secure key to determine the one or more cryptographic values in the same manner as done by the corresponding encoder 101. The cryptographic values which are determined by the decoder 104 may be referred to as the verification cryptographic values. The verification cryptographic values are then compared with the cryptographic values stored in the metadata frame 130. In case of a match, it is confirmed that the individual frame and/or the sequence of frames has not been modified. On the other hand, a mismatch indicates that the individual frame and/or the sequence of frames has been modified.

Alternatively or in addition to providing an indication of the secure key within the metadata frame 130, the decoder 104 may be configured to determine a plurality of sets of verification cryptographic values using a plurality of pre-determined secure keys known to the decoder 104. If one of the sets of verification cryptographic values matches the cryptographic values comprised in the metadata frame 130, the decoder 104 knows which secure key has been used and that the individual frame and/or the sequence of frames has not been modified. On the other hand, a mismatch for all sets of verification cryptographic values indicates that the individual frame and/or the sequence of frames has been modified.
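The key detection described above may be sketched as follows in Python. The sketch is illustrative only: the key values, the labels "high" and "moderate", the 64-bit truncation and the function names are assumptions, and a real decoder would operate on the bit sequences of the actual content and metadata frames.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical pre-determined secure keys known to the decoder.
KNOWN_KEYS = {
    "high": b"hypothetical-highly-secure-key-K1",
    "moderate": b"hypothetical-moderate-secure-key-K2",
}


def compute_value(key: bytes, message: bytes, num_bytes: int = 8) -> bytes:
    """Truncated HMAC-MD5 of the message under the given key."""
    return hmac.new(key, message, hashlib.md5).digest()[:num_bytes]


def detect_trust_level(message: bytes, received_value: bytes) -> Optional[str]:
    """Return the label of the key that verifies the message, or None."""
    for level, key in KNOWN_KEYS.items():
        verification_value = compute_value(key, message)
        # compare_digest performs a constant-time comparison.
        if hmac.compare_digest(verification_value, received_value):
            return level
    return None  # no known key matches: the security check does not pass


frame = b"content frame bytes" + b"metadata frame bytes"
value_k1 = compute_value(KNOWN_KEYS["high"], frame)
value_k2 = compute_value(KNOWN_KEYS["moderate"], frame)

level_k1 = detect_trust_level(frame, value_k1)
level_k2 = detect_trust_level(frame, value_k2)
level_tampered = detect_trust_level(frame + b"!", value_k2)
```

Here level_k1 is "high", level_k2 is "moderate" and level_tampered is None, which corresponds to the three outcomes (highly secure key detected, moderate secure key detected, no valid key detected).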

Being able to detect which key was used to secure a bitstream 110, 120 in decoders 104 and transcoders 103 enables applications to make finer-grained decisions on what to do with data of different trustworthiness. Decisions might be different depending on the detected secure key. In particular, the highly secure key may be detected, the moderate secure key may be detected or no valid key may be detected and the security check may not pass.

As such, levels of trustworthiness may be provided when using a plurality of different secure keys (which are attached to different levels of trust), compared to a solution which only uses a single secure key, where only a binary decision can be made on whether data can be trusted or not.

As described in the context of FIG. 1, a distribution chain 100 for audio content may comprise a transcoder 103 which is configured to convert an inbound bitstream 110 into an outbound bitstream 120. The transcoding performed by the transcoder 103 may relate to the transcoding from a first audio codec system to a second, possibly different, audio codec system. Alternatively or in addition, the transcoding may relate to a change of the bit-rate of the outbound bitstream 120 with respect to the bit-rate of the inbound bitstream 110. The transcoder 103 may comprise a decoder for decoding the inbound bitstream 110 into a PCM (pulse code modulated) audio signal. Furthermore, the transcoder 103 may comprise an encoder for encoding the PCM audio signal into the outbound bitstream 120. Such a transcoder 103 may be referred to as a “PCM-connected” transcoder, as the one or more decoders (for decoding the one or more inbound bitstreams 110) are connected to the one or more encoders (for encoding the one or more outbound bitstreams 120) via linear PCM.

The transcoder 103 may be a so-called professional transcoder, which is a device used by professional content providers such as broadcasters. As outlined above, the transcoder 103 may be configured to accept the inbound bitstream 110 in a first format (e.g. Dolby E) and to transcode the inbound bitstream 110 into a different format (e.g. Dolby Digital Plus). Such transcoders 103 typically incorporate one or more decoders (for decoding the inbound bitstream 110) and one or more encoders (for encoding the outbound bitstream 120).

A PCM-connected transcoder may have one or more PCM processing stages between the decoder and the encoder. Loudness leveling is one example of such PCM processing. Other examples of PCM processing are sample rate conversion, channel downmixing, and/or channel upmixing.

Such PCM-connected transcoders 103 pose a challenge with regard to the authenticity, protection and trust issues outlined above. As outlined above, an inbound bitstream 110 may comprise metadata frames 112, 312 which are protected using one or more cryptographic values (comprised e.g. in the protection field of the metadata frames 112, 312 as shown in Tables 1 and 4). A PCM-connected transcoder 103 allows a user to modify PCM data derived from the content frames 111, 311, thereby possibly invalidating the metadata comprised within the associated metadata frames 112, 312, and thereby possibly compromising the trustworthiness of the metadata.

In the present document, a method and a system for ensuring the trustworthiness of metadata in a transcoder 103 are described. In particular, the described method and system allow the trustworthiness of metadata comprised in metadata frames 112, 312 to be maintained, even when using a PCM-connected transcoder 103.

FIGS. 5a, 5b, 5c and 5d show example PCM-connected transcoders 503, 513, 523, 533, respectively. The transcoders comprise a decoder 504 which is configured to convert the inbound bitstream 110 (which comprises a sequence of content frames 111 and a sequence of associated metadata frames 112) into PCM data and metadata, respectively. The decoder 504 may be configured to verify the correctness of the inbound bitstream 110 using the protection scheme outlined above. For this purpose, the decoder 504 may be aware of some or all of the pre-determined secure keys.

Typically, a decoder 504 provides an unprotected set of PCM data and metadata (e.g. on a frame by frame basis). In other words, the decoder 504 typically decodes each content frame 111 and associated metadata frame 112 and provides the respective set of PCM data and metadata without protection. As such, the decoder 504 provides a sequence of sets of PCM data and metadata from a corresponding sequence of content frames 111 and metadata frames 112. The sequence of sets of PCM data and metadata may be modified by the transcoder and may then be passed to an encoder 501 which is configured to convert the sequence of (possibly modified) sets of PCM data and metadata to the outbound bitstream 120. In this context, the encoder 501 is typically not able to verify whether the sequence of (possibly modified) sets of PCM data and metadata has been modified in a sensible manner. In other words, the encoder 501 may not be able to verify the trustworthiness of the sequence of (possibly modified) sets of PCM data and metadata.

In the present document, it is proposed to enable the decoder 504 to provide one or more signature values based on one or more sets of PCM data and metadata, thereby allowing the protection of the PCM connection between the decoder 504 and the encoder 501. The signature values may be determined in a similar manner to the cryptographic values, as described above. However, the signature values may make use of a message m which comprises one or more sets of PCM data and metadata (in contrast to one or more content frames and associated metadata frames). In particular, the decoder 504 may be configured to determine:

-   -   a frame signature value based on an individual set of PCM data and associated metadata; and
    -   a history signature value based on two or more sequential sets of PCM data and associated metadata.

In other words, within the PCM domain of a PCM-connected transcoder 503 (i.e. between the decoder 504 and the encoder 501), the trustworthiness of the content may be “protected” using one or more signatures (also referred to as signature values). The decoder 504 may be configured to produce one or more signature values as an output. The one or more signature values may be calculated over the union of PCM data and regular metadata (taken from the content frame) and additional metadata (taken from the associated metadata frame), as produced by the decoder 504. As such, for each frame of the inbound bitstream 110, one or more signature values may be determined based on the decoded sets of PCM data and metadata. These one or more signature values may be used by the corresponding encoder 501 to verify whether a received set of PCM data and metadata has been modified or not, and/or is trustworthy or not.

The encoder 501 accepts the one or more signature values as an input, along with the PCM data, the regular metadata and the additional metadata. The encoder 501 may then check the signature values against the other inputs (i.e. against the received set(s) of PCM data and metadata). If the other inputs have been modified or tampered with, the signature check will fail and the encoder will take appropriate action. The verification of the one or more signature values may be performed at the encoder 501 by determining verification signature values based on the received one or more sets of PCM data and metadata (in a similar manner as described for the cryptographic values).
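The protection of the PCM connection may be sketched as follows in Python. The sketch rests on several assumptions that are not taken from the present document: PCM samples are represented as 16-bit integers, the three inputs are serialized with length prefixes before signing, HMAC-SHA-256 is used without truncation, and the key and function names are hypothetical.

```python
import hashlib
import hmac
import struct

# Hypothetical key known only to the decoder and the encoder.
SHARED_KEY = b"hypothetical-key-known-to-decoder-and-encoder"


def serialize(pcm_samples, regular_metadata: bytes, additional_metadata: bytes) -> bytes:
    """Build the message m from a set of PCM data and metadata."""
    pcm_bytes = struct.pack(f"<{len(pcm_samples)}h", *pcm_samples)
    parts = (pcm_bytes, regular_metadata, additional_metadata)
    # Length-prefix each part so that the boundaries are unambiguous.
    return b"".join(struct.pack("<I", len(part)) + part for part in parts)


def sign(pcm_samples, regular_metadata: bytes, additional_metadata: bytes) -> bytes:
    """Decoder side: determine a frame signature value."""
    message = serialize(pcm_samples, regular_metadata, additional_metadata)
    return hmac.new(SHARED_KEY, message, hashlib.sha256).digest()


def encoder_accepts(pcm_samples, regular_metadata, additional_metadata, signature) -> bool:
    """Encoder side: recompute the signature and compare it."""
    expected = sign(pcm_samples, regular_metadata, additional_metadata)
    return hmac.compare_digest(expected, signature)


pcm = [0, 120, -340, 15]
signature = sign(pcm, b"regular metadata", b"additional metadata")

untouched_ok = encoder_accepts(pcm, b"regular metadata", b"additional metadata", signature)
# An untrusted stage that halves the samples invalidates the signature.
modified_pcm = [sample // 2 for sample in pcm]
modified_ok = encoder_accepts(modified_pcm, b"regular metadata", b"additional metadata", signature)
```

In this sketch, untouched_ok is True and modified_ok is False, so an encoder following it could detect that the PCM data was changed between the decoder and the encoder.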

As such, the trustworthiness of the decoded PCM data (and the associated metadata) may be maintained within a PCM-connected transcoder 503 by enabling the decoder 504 to determine one or more signature values based on the decoded PCM data and the associated metadata and by enabling the corresponding encoder 501 to verify the authenticity of the to-be-encoded PCM data (and the associated metadata) based on the one or more signature values. The determination of the one or more signature values and their verification may be performed based on a single security key or based on a plurality of leveled security keys K₁ and K₂, as outlined above, wherein the one or more security keys may only be known to the decoder 504 and the encoder 501, and are typically unknown to an entity performing PCM processing on the connection between the decoder 504 and the encoder 501.

The use of one or more signature values allows the implementation of various use cases as illustrated in FIGS. 5a, 5b, 5c, and 5d. FIG. 5a illustrates a transcoder 503, where no PCM processing is performed between the decoder 504 and the encoder 501. As a consequence, the protected data 510 (comprising one or more sets of PCM data and associated metadata, as well as one or more associated signatures) is not modified and the chain of trust is maintained within the transcoder 503. As a result, the transcoder 503 of FIG. 5a is configured to receive an inbound bitstream 110 comprising a protected and trusted sequence of inbound content frames 111 and associated inbound metadata frames 112 (also referred to as evolution frames), and to provide an outbound bitstream 120 which comprises a protected and trusted sequence of outbound content frames 121 and associated outbound metadata frames 122. This is ensured by protecting the decoded PCM data, the regular metadata and the additional metadata (also referred to as evolution metadata) using one or more signatures. The encoder 501 verifies the one or more signatures and passes the additional metadata as outbound metadata frames 122 to the outbound bitstream 120. The use case shown in FIG. 5a may e.g. be applicable to the transcoding of a bitstream from a first bit-rate to a second bit-rate.

FIG. 5b shows a PCM-connected transcoder 513 where the chain of trust is broken by an untrusted PCM processing stage 505. The PCM processing stage 505 receives the protected data 510 and modifies the data 510. The PCM processing stage 505 is “untrusted” in that the PCM processing stage 505 is not aware of the secure key K used by the decoder 504. As a consequence, the modified data 511 comprises one or more sets of modified PCM data and associated metadata, as well as one or more invalid signatures. The encoder 501 is configured to determine the invalidity of the signatures and may be configured to take appropriate action. In particular, the encoder 501 may be configured to drop the additional metadata from the inbound metadata frames 112, thereby providing an outbound bitstream 120 which only comprises a sequence of content frames 121, but which does not comprise the associated metadata frames 122. By doing this, it is ensured that the transcoder 513 does not forward untrusted additional metadata. Furthermore, due to the fact that the bitstream 120 does not comprise metadata frames 122, the bitstream 120 does not comprise the above mentioned cryptographic values (from the protection fields of the metadata frames 122). As such, the bitstream 120 can be identified by a decoder 104 as being untrusted.

As indicated above, the encoder 501 may be configured to drop the additional metadata from the inbound metadata frames 112, if the one or more signature values are not valid. As outlined in the context of Table 2, the metadata blocks 140 of an inbound metadata frame 112 may be indicative of respective descriptors 142 which describe one or more properties of the corresponding metadata blocks 140. One of these properties may be the PCM processing parameter (referred to as tight_coupling parameter in Table 2). The encoder 501 may be configured to use the PCM processing parameter of a metadata block 140, in order to decide whether or not to include the metadata comprised within the metadata block 140 into the outbound bitstream 120. In particular, the PCM processing parameter may indicate to the encoder 501 to include metadata from a block 140 of the inbound metadata frame 112 into the outbound bitstream 120, even though the PCM samples of the associated content frame 111 have been modified.

Table 5 shows example semantics of the PCM processing parameter (i.e. of the tight_coupling parameter of Table 2). In the illustrated example, a value “0” of the PCM processing parameter indicates that the payload 143 (i.e. the metadata) of a block 140 of metadata should be included into the outbound bitstream 120 only if no PCM processing occurred, e.g. only if the one or more signature values have been verified by the encoder 501. On the other hand, a value “3” of the PCM processing parameter may indicate that the payload 143 of the block 140 should always be included into the outbound bitstream 120, even if the PCM samples have been modified, e.g. even if the one or more signature values have not been verified. Furthermore, the PCM processing parameter may take on values which indicate intermediate situations, i.e. the PCM processing parameter may take on values which indicate the PCM processing conditions that need to be met for the payload 143 to be included into the outbound bitstream 120 or which indicate the PCM processing conditions in case of which the payload 143 is not included into the outbound bitstream 120.

The PCM processing stage 505 may be configured to inform the encoder 501 about the processing which has been performed on the PCM samples in the PCM processing stage 505. In other words, the PCM processing stage 505 may be configured to inform the encoder 501 about the PCM processing conditions (e.g. conversion of the sampling rate of the PCM samples, inclusion of a system sound into the PCM samples, modification of the metadata, modification of a channel configuration (e.g. modification of a mono signal to a stereo signal, or downmixing of a 5.1 multi-channel signal to a stereo signal), leveling of the loudness, etc.). As such, the encoder 501 may be configured to receive indications of the PCM processing conditions from the PCM processing stage 505. Furthermore, the encoder 501 may be configured to process the metadata of a block 140 of metadata, based on the received PCM processing conditions and based on the value of the PCM processing parameter (e.g. in accordance with the semantics of Table 5).

TABLE 5

Value   Semantics
0       keep payload only if no PCM processing occurred
1       keep payload if one or more of the following changes to PCM occurred:
        - the sampling rate has been converted
2       keep payload if one or more of the following changes to PCM occurred:
        - any of the changes mentioned for case “1” above
        - system sounds are mixed into PCM
        - metadata have been modified
        - the channel configuration has been changed
        - loudness has been levelled
3       keep payload regardless of any PCM processing performed
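The semantics of Table 5 may be sketched as a decision function in Python. The sketch reflects one possible reading of the table, namely that a payload is kept if every change reported by the PCM processing stage is among the changes tolerated for the given value of the tight_coupling parameter (which includes the case where no change occurred). The condition names, the block representation and the function names are hypothetical.

```python
# Hypothetical labels for the PCM processing conditions of Table 5.
SAMPLE_RATE_CONVERTED = "sample_rate_converted"
SYSTEM_SOUNDS_MIXED = "system_sounds_mixed"
METADATA_MODIFIED = "metadata_modified"
CHANNEL_CONFIG_CHANGED = "channel_config_changed"
LOUDNESS_LEVELLED = "loudness_levelled"

# Changes tolerated for each value of the tight_coupling parameter.
TOLERATED_CHANGES = {
    0: frozenset(),
    1: frozenset({SAMPLE_RATE_CONVERTED}),
    2: frozenset({
        SAMPLE_RATE_CONVERTED,
        SYSTEM_SOUNDS_MIXED,
        METADATA_MODIFIED,
        CHANNEL_CONFIG_CHANGED,
        LOUDNESS_LEVELLED,
    }),
}


def keep_payload(tight_coupling: int, pcm_changes) -> bool:
    """Decide whether the payload of a metadata block is kept."""
    if tight_coupling == 3:
        return True  # keep regardless of any PCM processing performed
    return set(pcm_changes) <= TOLERATED_CHANGES[tight_coupling]


def filter_blocks(blocks, pcm_changes):
    """Keep only the (tight_coupling, payload) blocks that survive."""
    return [block for block in blocks if keep_payload(block[0], pcm_changes)]


blocks = [(0, b"payload A"), (1, b"payload B"), (2, b"payload C"), (3, b"payload D")]
after_resampling = filter_blocks(blocks, {SAMPLE_RATE_CONVERTED})
after_levelling = filter_blocks(blocks, {LOUDNESS_LEVELLED})
```

Note that the decision uses only the descriptor value and the reported processing conditions; the payload itself is not inspected, in line with the descriptor-based approach of the present document.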

FIG. 5c illustrates the case of a PCM-connected transcoder 523 which is configured to perform trusted PCM processing. This may be achieved by combining the PCM processing stage 506 with an additional re-signing stage 507. For this purpose, a trusted party may be provided with one or more of the secure keys, thereby enabling the trusted party to re-sign the modified data 511. By way of example, the trusted party may be provided with the moderate secure key K₂. As a result of this, the modified data 511 may be re-signed (i.e. one or more signature values may be determined based on the modified data 511 using the moderate secure key K₂), thereby providing the protected modified data 512 (comprising a sequence of sets of modified PCM data and associated metadata, as well as the one or more new signatures). The encoder 501 may be configured to verify the new signature and generate the trusted outbound bitstream 120 comprising the sequence of content frames 121 and the associated sequence of metadata frames 122. Furthermore, the encoder 501 may be configured to determine that the chain of trust has been broken and that a new chain has been created, because the re-signing stage 507 may have used a different secure key (e.g. the moderate secure key K₂) than the decoder 504 (which may have used the highly secure key K₁).

FIG. 5d shows a block diagram of a PCM-connected transcoder 533 with a PCM processing stage 508 comprised within the encoder 501. In particular, the transcoder 533 is configured to maintain a chain of trust by ensuring that the PCM processing is performed by an entity (e.g. the encoder 501) which is aware of the secure key used by the decoder 504 to determine the one or more signature values. The encoder 501 is configured to verify the one or more signatures of the protected data 510. The internal PCM processing stage 508 may then modify the received sets of PCM data and associated metadata. Furthermore, the encoder 501 may comprise a metadata update unit 509 which is configured to update the metadata frames, subject to the modifications performed in the PCM processing stage 508. In particular, the metadata update unit 509 may be configured to determine updated cryptographic values based on the transcoded content frames 121 and metadata frames 122. The updated cryptographic values may then be included into the metadata frames 122 for communication to the decoder 104.

FIGS. 6a and 6b provide another representation of the transcoders 503, 513, 523 and 533.

In the present document, methods and systems for transcoding metadata have been described. The methods and systems allow for a transcoding of metadata with a reduced computational complexity. In particular, it is proposed to provide descriptors for blocks of metadata, thereby enabling a transcoder to transcode the metadata based on the descriptors only, without the need of analyzing the actual metadata comprised within a block of metadata. By doing this, the complexity of a transcoder may be significantly reduced. Furthermore, the present document provides methods and systems for protecting metadata frames and for protecting PCM data in a PCM-connected transcoder. As a result, it can be ensured that a receiver of transcoded metadata is provided with an indication of the trustworthiness of the received metadata.

The methods and systems described in the present document may be implemented as software, firmware and/or hardware. Certain components may e.g. be implemented as software running on a digital signal processor or microprocessor. Other components may e.g. be implemented as hardware and/or as application specific integrated circuits. The signals encountered in the described methods and systems may be stored on media such as random access memory or optical storage media. They may be transferred via networks, such as radio networks, satellite networks, wireless networks or wireline networks, e.g. the Internet. Typical devices making use of the methods and systems described in the present document are portable electronic devices or other consumer equipment which are used to store and/or render audio signals.

What is claimed is:
1. An encoding method, comprising: receiving an inbound bitstream that includes an inbound content frame and an associated inbound metadata frame; encoding the inbound content frame to produce an encoded content frame; generating a protection field for the associated inbound metadata frame; encoding the associated inbound metadata frame, including the protection field to produce an encoded associated metadata frame; and including the encoded content frame and the encoded associated metadata frame in an output bitstream, wherein: generating the protection field involves generating one or more cryptographic values; at least one of the one or more cryptographic values is a frame cryptographic value that is indicative of the authenticity of the encoded content frame; and the frame cryptographic value is generated by applying a one-way function to a group of frames that includes the inbound content frame and the associated inbound metadata frame.
2. The encoding method of claim 1, wherein at least one of the one or more cryptographic values is generated by applying a one-way function to a group of frames that includes a preceding inbound content frame and an inbound metadata frame associated with the preceding inbound content frame.
3. The encoding method of claim 2, wherein the protection field includes a history cryptographic value that is indicative of the authenticity of at least two encoded content frames and at least two encoded metadata frames.
4. The encoding method of claim 1, wherein at least one of the one or more cryptographic values is generated using a key value and a cryptographic hash function.
5. The encoding method of claim 4, wherein: the key value corresponds to a secure key selected from a plurality of predetermined secure keys; a first key of the plurality of predetermined secure keys corresponds to a first level of trust; and a second key of the plurality of predetermined secure keys corresponds to a second level of trust that is different from the first level of trust.
6. The encoding method of claim 1, wherein the inbound content frame directly precedes the associated inbound metadata frame or vice versa.
7. An encoding apparatus including one or more hardware elements, the encoding apparatus configured for: receiving an inbound bitstream that includes an inbound content frame and an associated inbound metadata frame; encoding the inbound content frame to produce an encoded content frame; generating a protection field for the associated inbound metadata frame; encoding the associated inbound metadata frame, including the protection field to produce an encoded associated metadata frame; and including the encoded content frame and the encoded associated metadata frame in an output bitstream, wherein: generating the protection field involves generating one or more cryptographic values; at least one of the one or more cryptographic values is a frame cryptographic value that is indicative of the authenticity of the encoded content frame; and the frame cryptographic value is generated by applying a one-way function to a group of frames that includes the inbound content frame and the associated inbound metadata frame.
8. The encoding apparatus of claim 7, wherein at least one of the one or more cryptographic values is generated by applying a one-way function to a group of frames that includes a preceding inbound content frame and an inbound metadata frame associated with the preceding inbound content frame.
9. The encoding apparatus of claim 8, wherein the protection field includes a history cryptographic value that is indicative of the authenticity of at least two encoded content frames and at least two encoded metadata frames.
10. The encoding apparatus of claim 7, wherein at least one of the one or more cryptographic values is generated using a key value and a cryptographic hash function.
11. One or more non-transitory media having software stored thereon, the software including instructions for performing an encoding method, the encoding method comprising: receiving an inbound bitstream that includes an inbound content frame and an associated inbound metadata frame; encoding the inbound content frame to produce an encoded content frame; generating a protection field for the associated inbound metadata frame; encoding the associated inbound metadata frame, including the protection field to produce an encoded associated metadata frame; and including the encoded content frame and the encoded associated metadata frame in an output bitstream, wherein: generating the protection field involves generating one or more cryptographic values; at least one of the one or more cryptographic values is a frame cryptographic value that is indicative of the authenticity of the encoded content frame; and the frame cryptographic value is generated by applying a one-way function to a group of frames that includes the inbound content frame and the associated inbound metadata frame.
12. The one or more non-transitory media of claim 11, wherein at least one of the one or more cryptographic values is generated by applying a one-way function to a group of frames that includes a preceding inbound content frame and an inbound metadata frame associated with the preceding inbound content frame.
13. The one or more non-transitory media of claim 12, wherein the protection field includes a history cryptographic value that is indicative of the authenticity of at least two encoded content frames and at least two encoded metadata frames.
14. The one or more non-transitory media of claim 11, wherein at least one of the one or more cryptographic values is generated using a key value and a cryptographic hash function.
 15. Amethod for transcoding an inbound bitstream into an outbound bitstream,the inbound bitstream including a first inbound content frame and anassociated first inbound metadata frame, the method comprising: at adecoder: converting the first inbound content frame into a first set ofdecoded Pulse Code Modulated, referred to as PCM, samples; extractingfirst metadata from the first inbound metadata frame; identifying afirst inbound block of metadata in the first metadata, the first inboundblock of metadata being associated with a first inbound descriptorindicative of one or more properties of the first metadata, the one ormore properties including a PCM processing parameter indicative ofwhether the metadata of the first inbound block of metadata is to bediscarded by an encoder, subject to a modification of the first set ofdecoded PCM samples, a modification of the first extracted metadata, ora modification of both the first set of decoded PCM samples and thefirst extracted metadata; generating a frame signature value based onthe first set of decoded PCM samples and the first metadata; and passingthe first set of decoded PCM samples, the first metadata and the framesignature value to a corresponding encoder; and at the encoder:receiving the first set of decoded PCM samples, the first metadata andthe frame signature value; determining whether the frame signature valueis valid for the first set of decoded PCM samples and the firstmetadata; and determining, based at least in part on whether the framesignature value is valid, whether to generate a first outbound currentframe of the outbound bitstream from the first set of decoded PCMsamples and whether to generate, based at least in part on the firstinbound descriptor, an associated first outbound metadata frame of theoutbound bitstream from the first inbound metadata frame.
16. The method of claim 15, wherein the inbound bitstream includes a second inbound content frame and an associated second inbound metadata frame, the method further comprising: at the decoder: converting the second inbound content frame into a second set of decoded PCM samples; extracting second metadata from the second inbound metadata frame; generating a history signature value based, at least in part, on the first set of decoded PCM samples, the second set of decoded PCM samples, the first metadata and the second metadata; and passing the second set of decoded PCM samples, the second metadata and the history signature value to the encoder.
17. The method of claim 16, further comprising: at the encoder: receiving the second set of decoded PCM samples, the second metadata and the history signature value; and determining whether the history signature value is valid for the first set of decoded PCM samples, the second set of decoded PCM samples, the first metadata and the second metadata.
18. One or more non-transitory media having software stored thereon, the software including instructions for performing a transcoding method, the transcoding method comprising: at a decoder: converting the first inbound content frame into a first set of decoded Pulse Code Modulated, referred to as PCM, samples; extracting first metadata from the first inbound metadata frame; identifying a first inbound block of metadata in the first metadata, the first inbound block of metadata being associated with a first inbound descriptor indicative of one or more properties of the first metadata, the one or more properties including a PCM processing parameter indicative of whether the metadata of the first inbound block of metadata is to be discarded by an encoder, subject to a modification of the first set of decoded PCM samples, a modification of the first extracted metadata, or a modification of both the first set of decoded PCM samples and the first extracted metadata; generating a frame signature value based on the first set of decoded PCM samples and the first metadata; and passing the first set of decoded PCM samples, the first metadata and the frame signature value to a corresponding encoder; and at the encoder: receiving the first set of decoded PCM samples, the first metadata and the frame signature value; determining whether the frame signature value is valid for the first set of decoded PCM samples and the first metadata; and determining, based at least in part on whether the frame signature value is valid, whether to generate a first outbound current frame of the outbound bitstream from the first set of decoded PCM samples and whether to generate, based at least in part on the first inbound descriptor, an associated first outbound metadata frame of the outbound bitstream from the first inbound metadata frame.
19. The one or more non-transitory media of claim 18, wherein the inbound bitstream includes a second inbound content frame and an associated second inbound metadata frame, the transcoding method further comprising: at the decoder: converting the second inbound content frame into a second set of decoded PCM samples; extracting second metadata from the second inbound metadata frame; generating a history signature value based, at least in part, on the first set of decoded PCM samples, the second set of decoded PCM samples, the first metadata and the second metadata; and passing the second set of decoded PCM samples, the second metadata and the history signature value to the encoder.
20. The one or more non-transitory media of claim 19, the transcoding method further comprising: at the encoder: receiving the second set of decoded PCM samples, the second metadata and the history signature value; and determining whether the history signature value is valid for the first set of decoded PCM samples, the second set of decoded PCM samples, the first metadata and the second metadata.
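
By way of illustration only, the frame signature and history signature steps recited above may be sketched as follows. The sketch assumes that the cryptographic value is an HMAC computed with SHA-256 over a length-prefixed serialization of 16-bit PCM samples and raw metadata bytes; the claims only call for a key value and a cryptographic hash function, so the choice of HMAC-SHA-256, the serialization, the key, and all function names (`frame_signature`, `history_signature`, `encoder_accepts_frame`, `encoder_accepts_history`) are hypothetical and not taken from the present document.

```python
import hashlib
import hmac
import struct

# Hypothetical key value shared by decoder and encoder (assumption).
KEY = b"hypothetical-shared-key"


def _serialize(pcm, metadata):
    """Pack 16-bit PCM samples and metadata bytes with length prefixes,
    so that the boundary between the two parts is unambiguous."""
    pcm_bytes = struct.pack(f"<{len(pcm)}h", *pcm)
    return (struct.pack("<I", len(pcm_bytes)) + pcm_bytes
            + struct.pack("<I", len(metadata)) + metadata)


def frame_signature(pcm, metadata, key=KEY):
    """Decoder side: signature over one frame's PCM samples and metadata."""
    return hmac.new(key, _serialize(pcm, metadata), hashlib.sha256).digest()


def history_signature(frames, key=KEY):
    """Decoder side: signature over a sequence of (pcm, metadata) frames."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for pcm, metadata in frames:
        mac.update(_serialize(pcm, metadata))
    return mac.digest()


def encoder_accepts_frame(pcm, metadata, signature, key=KEY):
    """Encoder side: recompute the frame signature and compare in constant time."""
    return hmac.compare_digest(signature, frame_signature(pcm, metadata, key))


def encoder_accepts_history(frames, signature, key=KEY):
    """Encoder side: check the history signature over all received frames."""
    return hmac.compare_digest(signature, history_signature(frames, key))
```

In such a sketch, an encoder that finds the frame signature invalid, for example because the PCM samples were modified between decoder and encoder, could decline to generate the outbound metadata frame from the inbound metadata frame and fall back to other processing.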